Abstract. Traditionally, transfer functions have been designed manually for each operation
in a program, instruction by instruction. In such a setting, a transfer function
describes the semantics of a single instruction, detailing how a given abstract input state 
is mapped to an abstract output state. The net effect of a sequence of instructions, a 
basic block, can then be calculated by composing the transfer functions of the constituent 
instructions. However, precision can be improved by applying a single transfer function 
that captures the semantics of the block as a whole. Since blocks are program-dependent, 
this approach necessitates automation. There has thus been growing interest in comput- 
ing transfer functions automatically, most notably using techniques based on quantifier 
elimination. Although conceptually elegant, quantifier elimination inevitably induces a 
computational bottleneck, which limits the applicability of these methods to small blocks. 
This paper contributes a method for calculating transfer functions that finesses quantifier 
elimination altogether, and can thus be seen as a response to this problem. The practicality 
of the method is demonstrated by generating transfer functions for input and output states 
that are described by linear template constraints, which include intervals and octagons. 



In model checking [3] the behaviour of a program is formally specified with a model. Using 
the model, all paths through the program are then exhaustively checked against its require- 
ments. The detailed nature of the requirements entails that the program is simulated in a 
fine-grained way, sometimes down to the level of individual bits. Because of the complexity 
of this reasoning there has been much interest in abstracting away from the detailed nature 
of states. Then, the program checker operates over classes of related states — collections 
of states that are equivalent in some sense — rather than individual states. 

1.1. Program analysis by abstract interpretation. Abstract interpretation [26] provides
a systematic way to construct such program checkers. The key idea is to simulate the
execution of each concrete operation $g : C \to C$ in a program with an abstract analogue
$f : D \to D$, where $C$ and $D$ are domains of concrete values and descriptions, respectively.
Each abstract operation $f$ is designed to faithfully model its concrete counterpart $g$ in the
sense that if $d \in D$ describes a concrete value $c \in C$, sometimes written relationally as
$d \propto c$ [55], then the result of applying $g$ to $c$ is described by the action of applying
$f$ to $d$, that is, $f(d) \propto g(c)$. Even for a fixed set of abstractions, there are typically
many ways of designing the abstract operations. Ideally the abstract operations should compute
abstractions that are as descriptive, that is, as accurate as possible, though there is usually
interplay between accuracy and complexity, which is one reason why the literature is so rich.
Normally the abstract operations are manually designed up front, prior to the analysis itself,
but there are distinct advantages in synthesising the abstract operations from their concrete
versions as part of the analysis itself, in a fully automatic way, and the topic is therefore
attracting increasing attention [13] [HI [50] [STJ [58] [65] EE].

1.2. The motivation for automatic abstraction. One reason for automation stems
from operations that arise in sequences that are known as blocks. Suppose that such a
sequence is formed of $n$ concrete operations $g_1, g_2, \ldots, g_n$, and each operation $g_i$
has its own abstract counterpart $f_i$, henceforth referred to as its transfer function [46].
Suppose too that the input to the sequence $c \in C$ is described by an input abstraction
$d \in D$, that is, $d \propto c$. Then the result of applying the $n$ concrete operations to the
input (one after another) is described by applying the composition of the $n$ transfer functions
to the abstract input, that is, $f_n(\ldots f_2(f_1(d))) \propto g_n(\ldots g_2(g_1(c)))$. However,
a more accurate result can be obtained by deriving a single transfer function $f$ for the block
$g_n \circ \cdots \circ g_2 \circ g_1$ as a whole, designed so that
$f(d) \propto g_n(\ldots g_2(g_1(c)))$. The value of this approach has been demonstrated for linear
congruences [H] in the context of verifying bit-twiddling code [50].

To illustrate this interplay between block-level abstraction and precision, consider a
block consisting of three instructions $x := y - x$; $y := y - x$; $x := x + y$ that swaps the
values of the variables $x$ and $y$ without using a third variable [86, Chap. 2.19]. To aid
reasoning about the block as a whole, fresh variables are introduced, static single assignment
[30] style, so as to separate different assignments to the same variable. This gives
$x'' := y - x$; $y' := y - x''$; $x' := x'' + y'$ where $x''$ is an intermediate and $x$ and $x'$
(resp. $y$ and $y'$) represent the values of the variable $x$ (resp. $y$) on entry and exit from
the block. Since $x'' = y - x \wedge y' = y - x'' \wedge x' = x'' + y' \models y' = x \wedge x' = y$
it follows that cumulatively the block can be described by a pair of two-variable equalities
$x' = y \wedge y' = x$, which can be interpreted as a transfer function for the block. From this
transfer function it follows that if $x = 1$ holds on entry to the block then $y' = 1$ holds on
exit. Note that the equalities $x = 1$ and $y' = 1$ are considered to be two-variable since they
contain no more than two variables. Now consider applying transfer functions for each of the
three assignments in turn. Again, suppose that $x = 1$ holds prior to the assignment
$x'' := y - x$. Since the ternary constraint $x'' = y - x$ cannot be expressed within the
two-variable equality domain, the best that can ever be inferred by any transfer function
operating over this domain is $x = 1$ for the post-state. Likewise the best that can be inferred
for a transfer function that simulates $y' := y - x''$ is $x = 1$ for its post-state, and
similarly for $x' := x'' + y'$. Thus by composing transfer functions over two-variable equalities
one cannot show that $y' = 1$ holds on exit from the block. Therefore, the transfer function for
a block can be strictly more precise than the composition of the transfer functions for the
constituent instructions. Since blocks are program-dependent, such an approach relies on
automation rather than the manual provision of transfer functions for each instruction.
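
The block-level claim can be checked mechanically. The following minimal Python sketch executes
the three assignments with wrap-around arithmetic and confirms, by brute force over a small word
size, that the block as a whole realises $x' = y$ and $y' = x$. The function names and the 4-bit
default width are illustrative assumptions and do not come from the paper.

```python
from itertools import product


def swap_block(x, y, w=4):
    """Run x := y - x; y := y - x; x := x + y on w-bit words (wrapping)."""
    mask = (1 << w) - 1
    x2 = (y - x) & mask    # x'' := y - x
    y1 = (y - x2) & mask   # y'  := y - x''
    x1 = (x2 + y1) & mask  # x'  := x'' + y'
    return x1, y1


def block_swaps(w=4):
    """Check x' = y and y' = x for every pair of w-bit inputs."""
    return all(swap_block(x, y, w) == (y, x)
               for x, y in product(range(1 << w), repeat=2))
```

The check succeeds even though the intermediate value $x''$ may wrap, which is exactly the kind of
fact that is visible at the level of the block but not at the level of a single instruction.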

Another compelling reason for automation is the complexity of the concrete operations 
themselves; a problem that is heightened by the finite nature of machine arithmetic. For 
instance, even a simple concrete operation, such as increment by one, is complicated by the 
finite nature of computer arithmetic: if increment is applied to the largest integer that can 
be stored in a word, then the result is the smallest integer that is representable. As the 
transfer function needs to faithfully simulate concrete increment, the corner case inevitably 
manifests itself (if not in the transfer function itself then elsewhere [78]).

The problem of deriving transfer functions for machine instructions, such as those of the
x86, is particularly acute [5] since these operations not only update registers and memory
locations, but also side-effect status flags [9] [75], of which there are many. When deriving
a transfer function for a sequence of machine instructions it is necessary to reason about
how the status flags are used to pass state from one instruction to another. To illustrate
the importance of status flags, consider double-length addition, where the operands are
pairs of 32-bit words $(x_1, x_0)$ and $(y_1, y_0)$, the result is denoted $(z_1, z_0)$, and the
subscript 1 denotes the most significant half and 0 the least significant. Then the following
block $z_0 := x_0 + y_0$; $c := (z_0 < x_0)$; $z_1 := x_1 + y_1 + c$ realises 64-bit addition,
providing $<$ denotes an unsigned comparison [86, Chap. 2.15]. Without considering the carry
flag $c$, it is not clear how one can reconstruct that
$(2^{32} \cdot z_1 + z_0) = (2^{32} \cdot x_1 + x_0) + (2^{32} \cdot y_1 + y_0)$ modulo $2^{64}$,
which is the high-level abstraction of the semantics of the block, without resorting to
a reduced cardinal power construction [27, Theorem 10.2.0.1]. In such a construction, a
domain that can express relations such as $z_0 < x_0$, henceforth called the base domain, is
refined with respect to a domain which traces the value of $c$, an adjunct that is sometimes
called the exponent domain. This refinement enables $c$ to monitor whether $z_0 < x_0$ holds or
not. Although a base domain can always be refined in this way, and the transfer functions
enriched to support the extra expressiveness, an alternative approach is to derive a transfer
function for a block of instructions which, in cases such as the above, better matches
what can be expressed in the base domain.
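
The role of the carry can be made concrete with a small Python sketch of the three-instruction
block. The name `add64` and the tuple it returns are assumptions made for illustration; only the
three assignments themselves are taken from the text.

```python
MASK32 = (1 << 32) - 1


def add64(x1, x0, y1, y0):
    """Double-length addition on 32-bit halves, returning (z1, z0, c)."""
    z0 = (x0 + y0) & MASK32       # z0 := x0 + y0 (wraps modulo 2**32)
    c = 1 if z0 < x0 else 0       # c  := (z0 < x0), an unsigned comparison
    z1 = (x1 + y1 + c) & MASK32   # z1 := x1 + y1 + c
    return z1, z0, c
```

For example, adding 1 to the pair (1, 0xFFFFFFFF) wraps the low half to 0, sets the carry, and
propagates it into the high half, so that the combined value agrees with ordinary addition
modulo $2^{64}$.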

As a final piece of motivation, it is worth noting that there are several ways of implementing
double-length addition, and numerous ways of realising other commonly occurring
operations [86], and therefore pattern matching can never yield a systematic or reliable
way of computing transfer functions for basic blocks.

1.3. Specifying extreme values with universal quantifiers. Monniaux [57] EH] recently
addressed the vexing question of automatic abstraction by focussing on template
domains [72] which include, most notably, intervals [2] and octagons [56]. He showed that
if the concrete operations are specified as piecewise linear functions, then it is possible to
derive transfer functions for blocks using quantifier elimination. To illustrate the role of
quantification, suppose a piecewise linear function models a block that updates three registers
whose values on entry and exit are represented by bit-vectors $x$, $y$ and $z$ and $x'$, $y'$
and $z'$ respectively. To derive a transfer function for interval analysis, it is necessary to
ascertain how the maximal value of $x'$, denoted $x'_u$ say, relates to the minimal and maximal
values of $x$, $y$ and $z$, denoted $x_\ell$ and $x_u$, $y_\ell$ and $y_u$ and $z_\ell$ and $z_u$
respectively. The value of $x'_u$ can be specified in logic [57] by asserting that:

• for all values of $x$, $y$ and $z$ that fall within the intervals $x \in [x_\ell, x_u]$,
$y \in [y_\ell, y_u]$ and $z \in [z_\ell, z_u]$, the value of $x'_u$ is greater than or equal to $x'$;

• for some combination of values of $x$, $y$ and $z$ such that $x \in [x_\ell, x_u]$,
$y \in [y_\ell, y_u]$ and $z \in [z_\ell, z_u]$, the output $x'$ takes the value of $x'_u$.

The "for some" can be expressed with existential quantification, and the "for all" with
universal quantification. By applying quantifier elimination, a direct relationship between
$x_\ell$, $x_u$, $y_\ell$, $y_u$, $z_\ell$, $z_u$ and $x'_u$ can be found, yielding a mechanism for
computing $x'_u$ in terms of $x_\ell$, $x_u$, $y_\ell$, $y_u$, $z_\ell$, $z_u$. This construction is
ingenious but quantifier elimination is at least exponential for rational and real piecewise
linear systems [20] [87], and is doubly exponential when quantifiers alternate [31]. Hence, its
application requires extreme care

[an- 
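
The two requirements can be combined into one formula. As a sketch, assume a relation
$\varphi(x, y, z, x')$ that holds exactly when the block maps the inputs to the output $x'$; the
symbols $\varphi$ and $R$ are introduced here for illustration only and are not notation from the
paper.

```latex
\[
\begin{array}{l}
R \;\equiv\; x_\ell \leq x \leq x_u \,\wedge\, y_\ell \leq y \leq y_u \,\wedge\, z_\ell \leq z \leq z_u \\[4pt]
\bigl(\forall x, y, z, x' : (R \wedge \varphi(x, y, z, x')) \Rightarrow x' \leq x'_u\bigr)
\;\wedge\;
\bigl(\exists x, y, z, x' : R \wedge \varphi(x, y, z, x') \wedge x' = x'_u\bigr)
\end{array}
\]
```

Eliminating the quantified variables would leave a formula over the six input bounds and $x'_u$
alone, which is where the alternation of quantifiers becomes costly.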

As an alternative to operating over piecewise linear systems [13], one can instead express
the semantics of a basic block with a Boolean formula; an idea that is familiar in model
checking where it is colloquially referred to as bit-blasting [22]. First, bit-vector logic is used
to represent the semantics of a block as a single CNF formula $f_{\mathrm{block}}$ (an excellent
tutorial on flattening bit-vector logic into propositional logic is given in [51, Chap. 6]). Thus,
each $n$-bit integer variable is represented as a separate vector of $n$ propositional variables.
Second, the above specification is applied to express the maximal value (or conversely the minimal
value) of an output bit-vector in terms of the ranges on the input bit-vectors. This gives a
propositional formula $f_{\mathrm{spec}}$ which is essentially $f_{\mathrm{block}}$ augmented with
universal quantifiers and existential quantifiers. Third, the quantifiers are removed from
$f_{\mathrm{spec}}$ to obtain $f_{\mathrm{simp}}$, a simplification of $f_{\mathrm{spec}}$. Of course,
$f_{\mathrm{simp}}$ is just a Boolean formula and does not prescribe how to compute a transfer
function. However, a transfer function can be extracted from $f_{\mathrm{simp}}$ by abstracting
$f_{\mathrm{simp}}$ with linear affine equations [38] which directly relate the output ranges
to the input ranges. This fourth step (which is analogous to that proposed for abstracting
formulae with congruences [50]) is the final step in the construction.

This proposal for computing transfer functions [13] may seem attractive since computing
$\forall y : \varphi$, where $\varphi$ is a system of propositional constraints and $y$ is a vector
of variables, is straightforward when the formula $\varphi$ is in CNF. When $\varphi$ is an
arbitrary propositional system, a CNF formula $\psi$ that is equisatisfiable, denoted $\equiv$, to
$\varphi$ can be found [64] by introducing fresh variables $z$ to give
$\varphi \equiv \exists z : \psi$. However, then the transfer function synthesis problem amounts to
solving $\forall y : \exists z : \psi$ where $\psi$ is in CNF. To eliminate the existentially
quantified variables $z$, resolution [51, Chap. 9.2.3] can be applied, but the
quadratic nature of each resolution step compromises tractability as the size of $z$ increases.
The size of $z$ is proportional to the number of logical connectives in $\varphi$ which, in turn,
depends on the size of the bit-vectors and the complexity of the block under consideration.
It is therefore no surprise that this approach has only been demonstrated for blocks of
microcontroller code where the word-size is just 8 bits [13]. Although no polynomial-time
algorithms are known for existential quantifier elimination of CNF, new algorithms are
emerging [16] which will no doubt permit transfer functions to be derived for larger blocks.
Nevertheless, it would be preferable if quantifier elimination were avoided altogether.
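
The quadratic cost of a resolution step is easy to see in code. The sketch below eliminates one
existentially quantified variable from a CNF formula by resolving every clause that contains the
variable positively against every clause that contains it negatively. The representation
(frozensets of DIMACS-style signed integers) and the function name `eliminate` are assumptions
made for illustration, and the input is assumed to contain no tautological clauses.

```python
def eliminate(clauses, var):
    """Return a CNF equivalent to (exists var : clauses).

    `clauses` is a set of frozensets of non-zero integers; literal -v is the
    negation of variable v.
    """
    pos = [c for c in clauses if var in c]
    neg = [c for c in clauses if -var in c]
    result = {c for c in clauses if var not in c and -var not in c}
    for p in pos:                      # |pos| * |neg| resolvents in the worst case
        for n in neg:
            resolvent = (p - {var}) | (n - {-var})
            # Drop tautologies such as (a or not a), which constrain nothing.
            if not any(-lit in resolvent for lit in resolvent):
                result.add(frozenset(resolvent))
    return result
```

Eliminating $z$ from $(a \vee z) \wedge (\neg z \vee b)$ yields the single clause $(a \vee b)$.
Each step can multiply the number of clauses, which is why the size of $z$ matters so much.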

1.4. Avoiding quantifier elimination. This paper develops the work reported in [TJJ to
contribute a method for deriving transfer functions which replaces quantifier elimination
with successive calls to a SAT solver, where the number of calls grows linearly with the
word-size rather than the size of the formula that encodes the semantics of the block.
To illustrate, consider an octagon [56] which consists of a system of inequalities of the
form $\pm x \pm y \leq d$. For each of these inequalities, our approach derives the least
$d \in \mathbb{Z}$ (which is uniquely determined) such that the inequality holds for all feasible
values of $x$ and $y$ as defined by some propositional formula. As an example, consider the
inequality $x + y \leq d$. The constant $d$ is defined as
$d = \min\{c \in \mathbb{Z} \mid \forall x : \forall y : f(x, y) \Rightarrow x + y \leq c\}$ where
$f(x, y)$ is a propositional formula constraining the bit-vectors $x$ and $y$. Furthermore, given a
machine with word-length $w$, the maximal value in an unsigned representation of $x$ and $y$ is
$2^w - 1$, and thus we can derive an initial constraint $0 \leq d \wedge d \leq 2 \cdot (2^w - 1)$
for $d$, which can be expressed disjunctively as $\mu_\ell \vee \mu_u$ where:

• $\mu_\ell = 0 \leq d \wedge d \leq 2^w - 1$

• $\mu_u = 2^w \leq d \wedge d \leq 2 \cdot (2^w - 1)$

To determine which disjunct characterises $d$, it is sufficient to test the propositional formula
$\exists x : \exists y : f(x, y) \wedge x + y \geq 2^w$ for satisfiability. If satisfiable, then
$\mu_u$ is entailed by the inequality $x + y \leq d$, and $\mu_\ell$ otherwise. We proceed by
decomposing the new characterisation into a disjunction, as in dichotomic or binary search, and
repeating this step $w$ times to give $d$ exactly. Likewise, constants $d$ can be found for all
inequalities of the form $\pm x \pm y \leq d$, which provides a mechanism for computing an
octagonal abstraction that describes a given propositional formula. The force of this abstraction
technique is that it provides a way of deriving octagonal guards which must hold for a block to be
executed in a particular mode. For example, a block might have three modes of operation, depending
on whether an operation underflows, overflows, or does neither. Which mode is applicable then
depends on the values of variables on entry to the block, which motivates using guards to separate
and describe the different modes of operation. Knowing that a particular mode is applicable
permits a specialised transfer function to be applied for inputs that conform to that mode.
It is important to note that separating modes is a crucial step in the process of applying
abstract domains that operate on unbounded integers, such as affine equalities, to describe
finite bit-vector semantics. As an example, consider incrementing a variable $x$ by 1. If $x$
and its representative $x'$ on output are unbounded integers, the affine relation is merely
$x' = x + 1$. Now suppose that $x$ and $x'$ are 32-bit variables. Then, if $x < 2^{32} - 1$ it
follows that $x' = x + 1$, and $x' = 0$ otherwise. Even though each of the two cases can be
described in the affine domain, the join of these two affine relations conveys no useful
information at all. Separating modes ultimately leads to a transfer function being formulated as a
system of guarded updates, where the updates stipulate how the entry values are mapped to exit
values, and the guards indicate which mode holds and therefore which type of update is
applicable.
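
The increment example can be turned into a small guarded-update sketch over intervals. The
encoding below (a list of guard bounds paired with an update on interval bounds) and the decision
to return one output interval per reachable mode are assumptions made for illustration; the text
above does not fix how the results of different modes are combined.

```python
W = 32
MAX = (1 << W) - 1

# Each mode is (guard_lo, guard_hi, update); the layout is illustrative only.
MODES = [
    (0, MAX - 1, lambda lo, hi: (lo + 1, hi + 1)),  # exact:    x' = x + 1
    (MAX, MAX, lambda lo, hi: (0, 0)),              # overflow: x' = 0
]


def inc_transfer(lo, hi):
    """Apply each guarded update to the part of [lo, hi] that meets its guard."""
    out = []
    for g_lo, g_hi, update in MODES:
        m_lo, m_hi = max(lo, g_lo), min(hi, g_hi)   # restrict the input to the guard
        if m_lo <= m_hi:                            # the mode is reachable
            out.append(update(m_lo, m_hi))
    return out
```

An input interval that stays below $2^{32} - 1$ triggers only the exact mode, whereas an interval
that contains $2^{32} - 1$ yields a second, separate result for the wrapped value rather than a
single uninformative relation.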

This leaves the problem of how to compute the updates themselves; the input-output 
transformers that constitute the heart of the transfer function. We show that updates 
can also be computed without resorting to quantifier elimination. We demonstrate this
construction not only for intervals, but also for transfer functions over octagons. The method is
based on computing an affine abstraction of a Boolean formula that is derived to describe 
the mode. For intervals, the update details how the bounds of an input interval are mapped 
to new bounds of an output interval. For octagons, the update maps the constants on the 
input octagonal inequalities to new constants on the output inequalities. 

1.5. Contributions. Overall, the approach to computing transfer functions that is pre- 
sented in this paper confers the following advantages: 
• it is amenable to instructions whose semantics is presented as propositional formulae or
Satisfiability Modulo Theory (SMT) [10] formulae. The force of this is that such encodings
are readily available for instructions, due to the rise in popularity of SAT-based model
checking;

• it avoids the computational problems associated with eliminating variables from piece- 
wise linear systems and propositional formulae, particularly with regard to alternating 
quantifiers; 

• it proposes the use of transfer functions that are action systems of guarded updates. These 
transfer functions are attractive both in terms of their expressiveness and the ease with 
which they can be evaluated (only one expression need be evaluated for each inequality 
that describes the state on exit from the block); 

• it shows how the modes of a block can be found and how, for a given mode, the guards 
can be computed using repeated SAT solving. It is also shown how the updates for that
mode can be derived by interleaving SAT solving with affine abstraction;

• it shows how update operations, which in the case of interval analysis compute bounds
on the output intervals from bounds on the input intervals, need not be linear functions. 
Non-linear update operations can also be supported for transfer functions over octagons. 
In this context, the update operation computes the constants on the output octagonal 
inequalities from the constants on the input inequalities (the coefficients are fixed in both 
the input and output octagons hence computing a transfer function amounts to adjusting 
constants); 

• it explains how to handle operations that underflow, overflow, or do neither and even 
combinations of such behaviours, providing a way to seamlessly integrate template in- 
equalities with finite precision arithmetic. 

2. Outline of the approach 

Overall the paper proposes a systematic technique for inferring transfer functions that 
are defined as systems of guarded updates. This section illustrates the syntactic form of 
transfer functions, so as to provide an outline of the approach and a roadmap for the whole 
paper. The roadmap explains which sections of the paper are concerned with deriving which 
components of the transfer function. 

2.1. Modes. Transfer functions are inferred for blocks, such as the assembly code listing
in Fig. 1. (The approach is illustrated for blocks of 32-bit AVR UC3 assembly code [1],
though the techniques are completely generic.) Each instruction is modelled by at least
one, and at most four, Boolean functions according to whether it overflows or underflows,
or is exact, that is, whether the instruction neither overflows nor underflows. This division
into three cases reflects the ways the two's complement overflow (V) flag is set or cleared
[1]. In exceptional cases this flag is used in tandem with the negative (N) flag [1] and
thus it is natural to refine these three cases according to whether the negative flag is also
set or cleared. However, if the instruction overflows then the result is necessarily negative
whereas if it underflows then the result is non-negative, hence only the exact case needs to
be further partitioned. This gives four cases in all: overflow, underflow, exact and negative,
and exact and non-negative, each of which can be precisely expressed with a Boolean function
that describes a so-called mode. The different instructions that make up the block may
operate in different modes, though the mode of one instruction may preclude a mode of
another being applicable. A mode is then chosen for each instruction, giving a mode combination,
and a single Boolean formula is constructed for the block by composing a formula for each
instruction in the prescribed mode. If the composed formula is unsatisfiable, then the mode
combination is inconsistent. Otherwise, the mode combination is feasible and the formula
describes one type of wrapping (or non-wrapping) behaviour that can be realised within
the block.



2.2. Transfer functions. The composed formula is then used to distill a guard paired 
with an update; one pair is computed for each feasible mode combination. For example, the 
block listed in Fig. 1 has nine feasible mode combinations in all, yielding nine guard and
update pairs of the form: 



2 31 < ((rO)) + ((rl» < 2 31 
1 < <(r0)) < 2 31 - 1 
1 < ((rl)) < 2 31 - 1 




H 


r («rOj» 
I (((rO'J 


= -2 31 ^ 

= -2 31 ; 


i A 

i 




2 31 + 1 < ((r0» + ((rl)) < 2 32 - 2 
< <(r0>> < 2 31 - 1 
< ((rl)) < 2 31 - 1 




1-1 


f («rOj» 
I («rCJ 


_ _ 
• = 2 32 - 


«r0»» 
((rOe)) 


-Mu») A 
- «rlj») 



-2 31 + 1 < «r0» + ((rl)} < -1 Alf (((rOj» = -((rO u )) - ((rl*))) A 
< ((rl)) < 2 31 - 1 J \ (((rO'J = -«r0,» - ((rl,))) 

Each guard is a conjunction of linear template constraints over the inputs of the block, in
this case $\langle\langle r0 \rangle\rangle$ and $\langle\langle r1 \rangle\rangle$, which denote the
(signed) values of the registers R0 and R1 on entry to the block. The guards express properties of
R0 and R1 which must hold for the instructions to operate in the modes that make up the feasible
mode combination.

The update operations that augment the guards detail how the values of the registers
are mutated for a given mode combination. For example, if the first guard is applicable,
then the update asserts that the output value of R0 takes a value in the range
$[-2^{31}, -2^{31}]$ (which actually prescribes a single value); the lower and upper bounds of R0
on exit are denoted $\langle\langle r0'_\ell \rangle\rangle$ and $\langle\langle r0'_u \rangle\rangle$
in the update. The second update illustrates how $\langle\langle r0'_\ell \rangle\rangle$ and
$\langle\langle r0'_u \rangle\rangle$ can depend on the values of R0 and R1 on entry to the block,
where the input lower and upper bounds for R0 are denoted $\langle\langle r0_\ell \rangle\rangle$
and $\langle\langle r0_u \rangle\rangle$, and likewise for R1.



2.3. Automatic derivation and roadmap. The guard is constructed one inequality at
a time, by applying a form of dichotomic (or binary) search. This step amounts to a series
of calls to a SAT solver, as is explained in Sect. 3. Updates can be computed by inferring
an affine relationship between the bound on an output symbolic constraint and the input
symbolic bounds. Such a relationship can again be derived by repeated SAT solving, as
detailed in Sect. 4. Replicating this construction for each of the symbolic output constants
gives the update operation for a feasible mode combination. (Sect. 3 and Sect. 4 return to
the example introduced above, detailing the steps in the derivation of this transfer
function.) Yet, situations can arise for which the updates cannot be expressed using affine
relationships, motivating the study, in Sect. 5, of complementary classes of update which
can be formed from linear template inequality constraints [21] and non-linear template
equality constraints [23]. Updates that relate symbolic output constants to symbolic input
constraints using equalities are complementary to those based on inequalities: both are
useful when transfer functions are evaluated. Sect. 6 focuses on this topic and explains
how guards and updates are applied during fixed point computation. Evidence is presented
in Sect. 7 which demonstrates that the techniques presented in the paper are capable of
synthesising transfer functions for blocks, where previous approaches based on quantifier
elimination were prohibitively expensive. Finally, Sect. 8 surveys the related work and
Sect. 9 concludes.

3. Deriving Guards 

We express the concrete semantics of a block with Boolean formulae so as to ultimately
infer a set of guards that distinguish the wrapping behaviours of a block. The construction
given in [13] formulates this problem using quantification, so that quantifier elimination
can be applied to solve it. However, whereas universal quantifier elimination is attractive
computationally, this is not so for the elimination of existentially quantified variables. We
overcome this problem by reformulating the construction given in [13], and replace quantifier
elimination by a series of calls to a SAT solver. This section illustrates the power of this
transposition by deriving guards for some illustrative blocks of microcontroller instructions.

3.1. Deriving interval guards by range refinement. Consider deriving a transfer function
for the operation INC R0, which increments the value of R0 by one and stores the result
in R0. For this example, we assume that the operands are unsigned. We represent the value
of R0 by a bit-vector $r0$ and let $\langle r0 \rangle = \sum_{i=0}^{31} 2^i \cdot r0[i]$ where
$r0[i]$ denotes the $i$th element of $r0$. Note that in the sequel the following notational
distinction is maintained: R0 for a register, $r0$ for a bit-vector representing R0, and
$\langle r0 \rangle$ and $\langle\langle r0 \rangle\rangle$ for, respectively, the unsigned and signed
interpretation of the bit-vector $r0$. The instruction itself can operate in one of two modes:
(1) it overflows (iff $\langle r0 \rangle = 2^{32} - 1$) or (2) it is exact (otherwise). Note
that in the sequel the term exact is used to refer to a mode that is neither underflowing nor
overflowing. The semantics of these two modes can be expressed as two formulae:

(1) $\varphi_O(X) = \varphi(X) \wedge (\bigwedge_{i=0}^{31} r0[i])$

(2) $\varphi_E(X) = \varphi(X) \wedge (\bigvee_{i=0}^{31} \neg r0[i])$

where $\varphi(X)$ encodes the increment over bit-vectors $X = \{r0, r0'\}$ as follows:
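
One standard propositional encoding of a wrapping 32-bit increment, given here as an illustrative
assumption rather than as the encoding used in the paper, flips bit $i$ exactly when all lower
bits are set (the empty conjunction for $i = 0$ is true):

```latex
\[
\varphi(X) \;=\; \bigwedge_{i=0}^{31} \Bigl( r0'[i] \leftrightarrow
  \bigl( r0[i] \oplus \bigwedge_{j=0}^{i-1} r0[j] \bigr) \Bigr)
\]
```

Under this encoding, $\varphi_O(X)$ forces every bit of $r0'$ to be zero, which matches the wrapped
result of incrementing $2^{32} - 1$.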

Both formulae can be converted into equisatisfiable formulae in CNF by introducing fresh
variables $z$ [64] [83]. We therefore denote the resulting formulae by $\varphi_E(X, z)$ and
$\varphi_O(X, z)$. Following our initial approach [13], the transfer function for a multi-modal
block (where the internal instructions can wrap) is described as a system of guarded updates. In
the one-dimensional case, octagonal guards coincide with intervals. Each guard constitutes an
upper-approximation of those inputs that are compatible with the specific mode. In case of
the increment, we derive guards $g_O$ and $g_E$ defined as:

(1) $g_O = 2^{32} - 1 \leq \langle r0 \rangle \leq 2^{32} - 1$

(2) $g_E = 0 \leq \langle r0 \rangle \leq 2^{32} - 2$

1: ADD R0 R1;  2: MOV R2 R0;  3: EOR R2 R1;  4: LSL R2;
5: SBC R2 R2;  6: ADD R0 R2;  7: EOR R0 R2;

Figure 1: Assembly listing corresponding to the assignment R0' := isign(R0+R1, R1)

These guards partition the inputs into two disjoint spaces: (1) a single point for the overflow
case and (2) exact operation. To obtain these guards, we provide an algorithm which solves
a series of SAT instances, rather than following a monolithic all-in-one approach based on
quantifier elimination [13]. To illustrate our strategy, consider the computation of a least
upper bound $d$ for $\langle r0 \rangle$ for the formula $\varphi_E(X, z)$. Clearly, we have
$0 \leq d \leq 2^{32} - 1$. We start by putting:

\[ \psi_E(X, z) = \varphi_E(X, z) \wedge \langle r0 \rangle \geq 2^{31} \]

Recall that we use a binary encoding of integers in the Boolean formulae. Further, as $2^{31}$ is
a power of two, we can finesse the need for a complicated Boolean encoding of the predicate
$\langle r0 \rangle \geq 2^{31}$ by using the equivalent formula:

\[ \psi_E^{\mathrm{simp},1}(X, z) = \varphi_E(X, z) \wedge r0[31] \]

which is simpler both to formulate and to solve. Then, the satisfiability of
$\psi_E^{\mathrm{simp},1}(X, z)$ shows that $r0$ takes a value in the range
$2^{31} \leq \langle r0 \rangle \leq 2^{32} - 1$. Consequently, $d$ occurs in
the same range. We can thus further refine this range by testing:

\[ \psi'_E(X, z) = \varphi_E(X, z) \wedge \langle r0 \rangle \geq (2^{31} + 2^{30}) \]

for satisfiability, or equivalently:

\[ \psi_E^{\mathrm{simp},2}(X, z) = \varphi_E(X, z) \wedge r0[31] \wedge r0[30] \]

As $\psi_E^{\mathrm{simp},2}(X, z)$ is satisfiable, we infer that $d$ satisfies
$2^{30} + 2^{31} \leq d \leq 2^{32} - 1$. The method continues by repeatedly splitting the range
for $d$ into two equally sized halves. Only in the last iteration is the satisfiability check
found to fail, from which we conclude that $d = \sum_{i=1}^{31} 2^i = 2^{32} - 2$. Overall, this
deduction requires 32 SAT instances, but the similarity of the instances suggests that the
overhead can be mitigated somewhat by incremental SAT.
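
The refinement loop can be sketched in a few lines of Python. The oracle `is_sat` is a
hypothetical stand-in for one SAT call: it is assumed to answer whether the mode formula
conjoined with "value at least t" is satisfiable. To keep the sketch self-contained, the oracle is
realised here by brute force over an explicit list of models and an 8-bit word, neither of which
reflects how a real solver would be driven.

```python
def least_upper_bound(is_sat, w):
    """Largest w-bit value admitted by a (satisfiable) formula, using w oracle calls."""
    d = 0
    for k in reversed(range(w)):    # most significant bit first
        candidate = d | (1 << k)    # tentatively set bit k
        if is_sat(candidate):       # some model lies in the upper half of the range
            d = candidate
    return d


def brute_force_oracle(models):
    """Toy replacement for a SAT solver: `models` lists every satisfying value."""
    return lambda t: any(m >= t for m in models)


# Exact mode of an 8-bit increment: every input except 2**8 - 1 is admitted.
exact_inputs = [v for v in range(256) if v != 255]
bound = least_upper_bound(brute_force_oracle(exact_inputs), 8)
```

As in the 32-bit derivation, every query but the last succeeds for the exact mode, and the number
of queries equals the word size. With a real solver the bits fixed so far could be passed as
assumptions, so that consecutive queries share learnt clauses.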

3.2. Deriving octagonal guards by range refinement. In a second example, we show
how to extend the refinement technique from intervals to octagons. To illustrate the method,
consider the program fragment in Fig. 1. This program corresponds to an assignment
R0' := isign(R0+R1, R1) for signed values. The function isign assigns abs(R0+R1) to R0
if R1 is positive, and -abs(R0+R1) otherwise. R2 is used as a temporary register. The sum
of R0 and R1 is computed by instruction (1), and instructions (2) - (7) implement isign.
The semantics of even this simple block is not obvious due to the bounded nature of machine
arithmetic. For instance, if abs is applied to the smallest representable integer $-2^{31}$ then
the result is $2^{31}$ subject to overflow, which gives $-2^{31}$. To derive octagons that describe
such corner cases, we have to consider all combinations of over- and underflow modes of
the instructions. In the above program, the instructions ADD (sum) and LSL (left-shift) can
wrap in different ways, and thus are multi-modal. Neither EOR nor MOV can wrap; they are
both uni-modal. Note that in general, the instruction SBC (subtract-with-carry) is multi-modal.
However, in the case of two equal operands, the instruction can only result in 0 or
$-1$, depending on the carry-flag. We thus ignore the wrapping of SBC R2 R2 and consider
it to be uni-modal for simplicity of presentation. Note that only overflows occurred in the
previous example since the single operand was unsigned.
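
The corner cases are easier to appreciate by running the block. The following Python sketch
simulates the seven instructions on 32-bit registers. It assumes that LSL moves the top bit into
the carry, that MOV and EOR leave the carry untouched, and that SBC computes Rd - Rs - C; these
flag semantics follow the description in the text, and the function names are illustrative.

```python
MASK = 0xFFFFFFFF


def to_signed(v):
    """Interpret a 32-bit pattern as a two's complement integer."""
    return v - (1 << 32) if v & 0x80000000 else v


def isign_block(r0, r1):
    """Simulate the seven-instruction block and return the signed value of R0 on exit."""
    r0 &= MASK
    r1 &= MASK
    r0 = (r0 + r1) & MASK           # 1: ADD R0 R1
    r2 = r0                         # 2: MOV R2 R0
    r2 ^= r1                        # 3: EOR R2 R1
    carry = r2 >> 31                # 4: LSL R2 (top bit moves into the carry)
    r2 = (r2 << 1) & MASK
    r2 = (r2 - r2 - carry) & MASK   # 5: SBC R2 R2 gives 0 or -1
    r0 = (r0 + r2) & MASK           # 6: ADD R0 R2
    r0 ^= r2                        # 7: EOR R0 R2
    return to_signed(r0)
```

When the sum fits into a word the result carries the sign of R1, as isign prescribes. When the
inputs are $2^{31} - 1$ and 1, the first ADD wraps and the block returns $-2^{31}$, which is the
kind of behaviour that the mode analysis has to capture.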

3.2.1. Finding the feasible mode combinations. In what follows, let $\mu(X)$, defined as

\[
\mu(X) = \left(\bigwedge_{i=0}^{31} r0'[i] \leftrightarrow r0[i] \oplus r1[i] \oplus c[i]\right) \wedge
\neg c[0] \wedge \left(\bigwedge_{i=0}^{30} c[i+1] \leftrightarrow
(r0[i] \wedge r1[i]) \vee (r0[i] \wedge c[i]) \vee (r1[i] \wedge c[i])\right)
\]

denote the Boolean encoding of the instruction ADD R0 R1 over bit-vectors $X = \{r0, r1, \ldots\}$
obtained through static single assignment conversion. Here, $c$ is a bit-vector of intermediate
carry bits. The semantics of ADD R0 R1 is to compute the sum of R0 and R1 and store the
result in R0. Since we are now working with signed objects, let

«*» = (Er=o 2 2 i -^])-2- 1 -^-l] 

denote the value of a bit-vector x of length w, where x[w — 1] is interpreted as the sign-bit. 
Then, ADD RO Rl has four modes of operation: overflow, underflow, exact and non-negative, 
exact and negative. Underflow occurs, for example, if the arithmetic sum of ((rO)) and ((rl)) 
is less than — 2. The constraints for these modes, which are obtained directly from the 
instruction set specification [TJ p. 127], can be expressed as four Boolean formulae: 

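$$\begin{array}{rcl}
\mu_O(X) &=& \neg r0[31] \wedge \neg r1[31] \wedge r0'[31] \\
\mu_U(X) &=& r0[31] \wedge r1[31] \wedge \neg r0'[31] \\
\mu_P(X) &=& (r0[31] \vee r1[31] \vee \neg r0'[31]) \wedge (\neg r0[31] \vee \neg r1[31] \vee r0'[31]) \wedge \neg r0'[31] \\
         &=& (\neg r0[31] \vee \neg r1[31] \vee r0'[31]) \wedge \neg r0'[31] \\
         &=& (\neg r0[31] \vee \neg r1[31]) \wedge \neg r0'[31] \\
\mu_N(X) &=& (r0[31] \vee r1[31] \vee \neg r0'[31]) \wedge (\neg r0[31] \vee \neg r1[31] \vee r0'[31]) \wedge r0'[31] \\
         &=& (r0[31] \vee r1[31] \vee \neg r0'[31]) \wedge r0'[31] \\
         &=& (r0[31] \vee r1[31]) \wedge r0'[31]
\end{array}$$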

For example, the formula $\mu(X) \wedge \mu_O(X)$ describes the input-output relationships for ADD R0 R1
in overflow mode. The instruction LSL R2 shifts register R2 to the left by one bit-position;
the most-significant bit of R2 is moved into the carry-flag. If the carry-flag is set, an overflow
occurs; there is no underflow for LSL. Let $\nu(X) \wedge \nu_O(X)$ and $\nu(X) \wedge \nu_E(X)$ thus express
the overflow and exact modes of LSL R2. In an analogous way to the first ADD instruction,
let $\eta(X) \wedge \eta_O(X)$, $\eta(X) \wedge \eta_U(X)$, $\eta(X) \wedge \eta_P(X)$ and $\eta(X) \wedge \eta_N(X)$ express the semantics
of the instruction ADD R0 R2. Using these encodings, each of which fixes a single mode, we can
compose a Boolean formula for a fixed mode combination that expresses whether
one mode of one operation is consistent with another mode of another operation; the
unsatisfiability of this formula indicates that the chosen modes are inconsistent. For example,
the combination of $\mu_U(X)$, $\nu_E(X)$ and $\eta_P(X)$ is infeasible. The above block thus
has $4 \cdot 2 \cdot 4 = 32$ combinations of modes, of which only 9 are satisfiable, as
depicted in Tab. 1. It is thus necessary to derive guards only for the feasible combinations.
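The four modes of ADD can be checked against ordinary integer arithmetic on a small word size. The following is a minimal sketch, not part of the method itself: the width `W = 4` and the helper names `to_signed`, `add_mode` and `arithmetic_mode` are assumptions introduced for illustration, and exhaustive enumeration stands in for a SAT solver.

```python
W = 4  # assumed small word width so that all operand pairs can be enumerated


def to_signed(x, w=W):
    """Interpret the low w bits of x as a two's complement value."""
    x &= (1 << w) - 1
    return x - (1 << w) if (x >> (w - 1)) & 1 else x


def add_mode(r0, r1, w=W):
    """Classify ADD using only the sign bits of operands and result,
    mirroring the formulae mu_O, mu_U, mu_P and mu_N."""
    s0 = (r0 >> (w - 1)) & 1
    s1 = (r1 >> (w - 1)) & 1
    result = (r0 + r1) & ((1 << w) - 1)  # wrapped sum
    sr = (result >> (w - 1)) & 1
    if not s0 and not s1 and sr:
        return "O"  # overflow
    if s0 and s1 and not sr:
        return "U"  # underflow
    return "N" if sr else "P"  # exact negative / exact non-negative


def arithmetic_mode(r0, r1, w=W):
    """Classify ADD by comparing the unbounded sum with the signed range."""
    total = to_signed(r0, w) + to_signed(r1, w)
    if total > (1 << (w - 1)) - 1:
        return "O"
    if total < -(1 << (w - 1)):
        return "U"
    return "N" if total < 0 else "P"
```

For every pair of 4-bit operands the two classifications agree, which is the property that allows the modes to be expressed over sign bits alone.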

Table 1: Feasible and infeasible modes for the program in Fig. 3.2
(O = overflow, U = underflow, E = exact, P = exact and non-negative, N = exact and negative)

ADD R0 R1  LSL R2  ADD R0 R2  feasible?   |   ADD R0 R1  LSL R2  ADD R0 R2  feasible?
O          E       O          no          |   P          E       O          no
O          E       U          no          |   P          E       U          no
O          E       P          no          |   P          E       P          yes
O          E       N          no          |   P          E       N          no
O          O       O          no          |   P          O       O          no
O          O       U          yes         |   P          O       U          no
O          O       P          no          |   P          O       P          yes
O          O       N          yes         |   P          O       N          no
U          E       O          no          |   N          E       O          no
U          E       U          no          |   N          E       U          no
U          E       P          no          |   N          E       P          no
U          E       N          no          |   N          E       N          yes
U          O       O          no          |   N          O       O          no
U          O       U          no          |   N          O       U          yes
U          O       P          yes         |   N          O       P          no
U          O       N          yes         |   N          O       N          yes

3.2.2. Incremental elimination of mode combinations. The number of mode combinations
in a single basic block is, in the worst case, exponential in the number of instructions in
the block. The number of calls to a SAT solver required to determine feasibility is thus
exponential too. Further, incremental SAT solving [89], which greatly affects the efficiency
of modern solvers, cannot be exploited when the feasibility of the mode combinations is
checked one by one. We therefore present a strategy for incrementally checking the
feasibility of mode combinations. To illustrate, let $\varphi(X)$ encode the instructions of the entire
block and consider the case where ADD R0 R1 underflows and LSL R2 is exact. The formula

$$\varphi(X) \wedge \mu_U(X) \wedge \nu_E(X)$$

describes this compound mode, independent of the second ADD. Since this formula is
unsatisfiable, it follows that the mode combinations

$$\begin{array}{l}
\varphi(X) \wedge \mu_U(X) \wedge \nu_E(X) \wedge \eta_O(X), \\
\varphi(X) \wedge \mu_U(X) \wedge \nu_E(X) \wedge \eta_U(X), \\
\varphi(X) \wedge \mu_U(X) \wedge \nu_E(X) \wedge \eta_P(X), \\
\varphi(X) \wedge \mu_U(X) \wedge \nu_E(X) \wedge \eta_N(X)
\end{array}$$

are also infeasible. This suggests extending the formula $\varphi(X)$ with mode constraints, such
as $\mu_U(X)$, in a tree-like fashion, instruction by instruction. A sub-tree, which represents the
different modes of one instruction, is then created and followed iff the formula is satisfiable.
This strategy is illustrated in Fig. 2. Observe that the technique may increase the overall
number of SAT instances to be solved: 36 instead of 32 for the running example. In the
worst case, if all leaves are reachable, the strategy requires an exponential number of SAT
calls. However, the tree-like strategy integrates smoothly with incremental SAT solving [89]
since the additional mode constraints can be passed as assumptions, thereby permitting an
incremental SAT solver to reuse learnt clauses. Which technique outperforms the other
depends strongly on the distribution of feasible modes; there is thus no clear winner.
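The call counts quoted above can be reproduced with a small simulation. The sketch below is illustrative only: the class `Oracle` is a hypothetical stand-in for a SAT solver that answers from the nine feasible combinations of Tab. 1 (a prefix of modes is deemed satisfiable iff it extends to a feasible combination), and `tree_search` and `flat_search` are names introduced here.

```python
from itertools import product

# Modes of the three multi-modal instructions, in program order.
MODES = [
    ("O", "U", "P", "N"),  # (1) ADD R0 R1
    ("O", "E"),            # (4) LSL R2
    ("O", "U", "P", "N"),  # (6) ADD R0 R2
]

# The nine combinations reported as feasible in Tab. 1.
FEASIBLE = {"OOU", "OON", "UOP", "UON", "PEP", "POP", "NEN", "NOU", "NON"}


class Oracle:
    """Stand-in for a SAT solver that also counts how often it is called."""

    def __init__(self):
        self.calls = 0

    def sat(self, prefix):
        self.calls += 1
        return any(combo.startswith(prefix) for combo in FEASIBLE)


def tree_search(oracle, prefix=""):
    """Extend the mode prefix instruction by instruction, pruning
    every sub-tree whose prefix is already unsatisfiable."""
    if len(prefix) == len(MODES):
        return [prefix]
    found = []
    for mode in MODES[len(prefix)]:
        candidate = prefix + mode
        if oracle.sat(candidate):
            found.extend(tree_search(oracle, candidate))
    return found


def flat_search(oracle):
    """Check every complete mode combination one by one."""
    combos = ("".join(c) for c in product(*MODES))
    return [c for c in combos if oracle.sat(c)]
```

Under these assumptions the tree-like strategy issues 4 + 8 + 24 = 36 queries whereas the flat enumeration issues 32, yet only the former presents the solver with a sequence of formulae that differ by one added assumption.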



3.2.3. Deriving guards for the feasible mode combinations. For all feasible mode combinations,
it is still necessary to compute (abstract) guards which describe an over-approximation
of those inputs that satisfy the respective mode. To illustrate the technique, consider the
case where instruction (1) underflows, instruction (4) overflows and instruction (6) is exact
and non-negative. With $\varphi(X)$ encoding the instructions that constitute the block as before,
the formula $\xi(X)$ which encodes this mode combination is thus defined as:

$$\xi(X) = \varphi(X) \wedge \mu_U(X) \wedge \nu_O(X) \wedge \eta_P(X)$$

Figure 2: Incremental elimination of feasible modes

Algorithm 1 Compute the least signed value $d$ of the $k$-bit vector $\mathbf{d} = (d[0], \ldots, d[k-1])$
such that the Boolean formula $\varphi$ and the inequality $\sum_{i=1}^{n} c_i \cdot \langle\langle v_i \rangle\rangle \leq \langle\langle \mathbf{d} \rangle\rangle$ both hold; where
the formula $\kappa$ encodes $\sum_{i=1}^{n} c_i \cdot \langle\langle v_i \rangle\rangle = \langle\langle \mathbf{d} \rangle\rangle$
Input: $\varphi$, $\kappa$
 1: $\phi \leftarrow \varphi \wedge \kappa$
 2: {check the sign}
 3: if $\phi \wedge \neg d[k-1]$ is satisfiable then
 4:   $d \leftarrow 0$
 5:   $\phi \leftarrow \phi \wedge \neg d[k-1]$
 6: else
 7:   $d \leftarrow -2^{k-1}$
 8:   $\phi \leftarrow \phi \wedge d[k-1]$
 9: end if
10: {iterate over bits $k-2, \ldots, 0$}
11: for $i = 1 \rightarrow k-1$ do
12:   if $\phi \wedge d[k-i-1]$ is satisfiable then
13:     $d \leftarrow d + 2^{k-i-1}$
14:     $\phi \leftarrow \phi \wedge d[k-i-1]$
15:   else
16:     $\phi \leftarrow \phi \wedge \neg d[k-i-1]$
17:   end if
18:   $i \leftarrow i + 1$
19: end for
20: return $d$

To derive an octagonal abstraction of the inputs that satisfy $\xi(X)$, first consider the problem
of computing the least upper bound $d$ for the octagonal expression $\langle\langle r0 \rangle\rangle + \langle\langle r1 \rangle\rangle$. To do so,
let $\kappa$ be a formula encoding $\langle\langle \mathbf{d} \rangle\rangle = \langle\langle r0 \rangle\rangle + \langle\langle r1 \rangle\rangle$ where $\mathbf{d}$ is extended to 34 bits to prevent
wraps in the octagonal expression (cf. [25, Sect. 3.3]). Then, check

$$\psi_1(X) = \xi(X) \wedge \kappa \wedge \neg d[33]$$

for satisfiability to derive a coarse approximation of $d$. The satisfiability of $\psi_1(X)$ shows
that $d \geq 0$. We thus proceed with testing

$$\psi_2(X) = \xi(X) \wedge \kappa \wedge \neg d[33] \wedge d[32]$$

for satisfiability. The unsatisfiability of $\psi_2(X)$ indicates $d < 2^{32}$. Next we consider

$$\psi_3(X) = \xi(X) \wedge \kappa \wedge \neg d[33] \wedge \neg d[32] \wedge d[31]$$

The unsatisfiability of $\psi_3(X)$ shows $d < 2^{31}$. Then we test

$$\psi_4(X) = \xi(X) \wedge \kappa \wedge \neg d[33] \wedge \neg d[32] \wedge \neg d[31] \wedge d[30]$$

This and the ensuing formulae are all satisfiable. The exact least upper bound is thus
$\langle\langle \mathbf{d} \rangle\rangle = 2^{30} + 2^{29} + \ldots + 2^{0} = 2^{31} - 1$, hence $\langle\langle r0 \rangle\rangle + \langle\langle r1 \rangle\rangle \leq 2^{31} - 1$.

Alg. 1 presents this tactic for the general case of maximising a linear expression of $n$
variables. The algorithm relies on a propositional encoding for an affine inequality constraint
$\sum_{i=1}^{n} c_i \cdot \langle\langle v_i \rangle\rangle \leq d$ where $c_1, \ldots, c_n, d \in \mathbb{Q}$. To see that such an encoding is possible assume,
without loss of generality, that the inequality is integral and $d$ is non-negative. Then rewrite
the inequality as $\sum_{i=1}^{n} c_i^{+} \cdot \langle\langle v_i \rangle\rangle \leq d + \sum_{i=1}^{n} c_i^{-} \cdot \langle\langle v_i \rangle\rangle$ where $(c_1^{+}, \ldots, c_n^{+}), (c_1^{-}, \ldots, c_n^{-}) \in \mathbb{N}^n$
and $\mathbb{N} = \{ i \in \mathbb{Z} \mid 0 \leq i \}$. Let $c^{+} = \sum_{i=1}^{n} c_i^{+}$ and $c^{-} = \sum_{i=1}^{n} c_i^{-}$. Since $\langle\langle v_i \rangle\rangle \in [-2^{w-1}, 2^{w-1} - 1]$
for each bit-vector $v_i$, it follows that computing the sums $\sum_{i=1}^{n} c_i^{+} \cdot \langle\langle v_i \rangle\rangle$
and $d + \sum_{i=1}^{n} c_i^{-} \cdot \langle\langle v_i \rangle\rangle$ with a signed $1 + \lceil \log_2(1 + \max(2^w \cdot c^{+}, d + 2^w \cdot c^{-})) \rceil$ bit representation
is sufficient to avoid wraps [13, Sect. 3.2]. Lines 4-9 provide special treatment for the
sign. Lines 11-20 represent the core of the algorithm. Since the goal is maximisation, the
algorithm instantiates each bit of $\mathbf{d}$ with 1, starting with $d[k-2]$, and checks satisfiability
of the respective formula. If satisfiable, the bit $d[k-i-1]$ is fixed at 1, and then the next
highest bit is examined. If unsatisfiable, the bit $d[k-i-1]$ can only take the value 0, and
the algorithm moves on to maximise the next highest bit. Variants of this algorithm have
been reported elsewhere [Tl] [23] . 
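The bit-by-bit tactic of Alg. 1 can be sketched in a few lines. This is a minimal illustration rather than the implementation used for the experiments: the parameter `sat` is a hypothetical oracle that takes a partial assignment to the bits of the $k$-bit signed vector and reports whether the underlying formula has a model agreeing with it, and `brute_force_oracle` is an assumed stand-in that answers from an explicit set of values instead of calling a SAT solver.

```python
def maximise(k, sat):
    """Return the largest signed k-bit value admitted by the oracle.

    `sat(fixed)` receives a dict mapping bit positions to 0/1 and returns
    True iff some model agrees with all of these bits (hypothetical API).
    """
    fixed = {}
    # Sign bit first: a clear sign bit means a non-negative value exists.
    if sat({k - 1: 0}):
        fixed[k - 1] = 0
        d = 0
    else:
        fixed[k - 1] = 1
        d = -(1 << (k - 1))
    # Remaining bits from most to least significant, greedily set to 1.
    for bit in range(k - 2, -1, -1):
        if sat({**fixed, bit: 1}):
            fixed[bit] = 1
            d += 1 << bit
        else:
            fixed[bit] = 0
    return d


def brute_force_oracle(values, k):
    """Oracle over an explicit, non-empty set of signed k-bit values."""
    mask = (1 << k) - 1

    def sat(fixed):
        return any(
            all((((v & mask) >> pos) & 1) == bit for pos, bit in fixed.items())
            for v in values
        )

    return sat
```

Each call to `maximise` issues exactly `k` oracle queries, one per bit, which is the source of the linear dependence on bit-width.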

Repeating this tactic for all nine feasible modes, we compute the following optimal
octagonal guards:



Here, redundant inequalities, which are themselves entailed by the given guards, are omitted
for clarity of presentation. Note that if the non-negative and negative sub-cases were
not distinguished then the feasible modes $P^{(1)}, E^{(4)}, P^{(6)}$ and $N^{(1)}, E^{(4)}, N^{(6)}$ would be
conflated into a single mode, for which the guard would be $-2^{31} + 1 \leq \langle\langle r0 \rangle\rangle + \langle\langle r1 \rangle\rangle \leq 2^{31} - 1 \wedge -2^{31} \leq \langle\langle r1 \rangle\rangle \leq 2^{31} - 1$,
which is almost vacuous. The net effect of such a guard is
that its accompanying update operation would be applied frequently, possibly unnecessarily,
inducing a loss of precision. This explains why it is attractive to separate the exact modes
into two sub-cases. One can imagine enriching the modes by additionally considering, for
instance, the zero flag, though as yet we have not encountered an example that warrants
resolving modes to this finer level of granularity.

3.2.4. Complexity. A total of $4 \cdot 34 + 4 \cdot 33$ SAT instances is solved for each octagonal guard.
This is because a 34-bit extended representation is used for constraints $\pm v_1 \pm v_2 \leq d$, whereas 33 bits
are used for constraints $\pm v_1 \leq d$. While this may appear large, it is important to appreciate
that the number of SAT instances grows linearly with the bit-width. By way of comparison
with [13], adding a single propositional variable to a formula can increase the complexity of
resolution quadratically [5TJ Sect. 9.2.3]. 

3.3. Deriving template guards by range refinement. The generality of Alg. 1 hints
that the approach to deriving guards can be generalised to template inequalities where
the coefficients are restricted to take a finite range of possible values. Logahedra [45] and
octahedra [21] satisfy this property, the former being a class of two-variable inequalities where
the coefficients are limited to the range $\{-2^k, \ldots, -2^3, -2^2, -2^1, 0, 2^1, 2^2, 2^3, \ldots, 2^k\}$, and the
latter being a class of $n$-variable inequalities where the coefficients are drawn from $\{-1, 0, 1\}$.
The approach straightforwardly generalises to other finite classes of inequality, though it
becomes less attractive as the number of template inequalities increases.



Transformers over template constraints have been previously formulated using quantifica- 
tion [131 EZJ- To avoid this, we derive affine relationships between output variables and 
input variables. These relations are then lifted to symbolic constraints that detail how the 
bounds of an input interval are mapped to the bounds of an output interval. The technique 
is then refined to support octagons, so as to derive linear relationships between the symbolic 
constants of the input octagon and the symbolic constants of the output octagon. Note that
Sect. 4.2 and Sect. 4.3 are given for pedagogical purposes only; they build towards Sect. 4.4,
which provides a linear symbolic update operation that is optimal (if any equality relation
exists between the input and output symbolic constants then it will be found). Sect. 4.2
and Sect. 4.3 motivate Sect. 4.4 rather than provide technical background, hence the latter
section can be read independently of the former sections if so desired.

4.1. Inferring affine equalities. Our algorithm computes an affine abstraction of the
models for a given mode-combination. To solve for affine input-output relations, let $X$
denote the set of bit-vectors as before. Consider the Boolean formula $\xi(X)$ for the case
where (1) underflows, (4) overflows and (6) is exact and non-negative. The process of
deriving an affine abstraction follows the scheme first presented in [13, Sect. 3.2]. It starts
with solving the formula $\xi(X)$, which produces a model $m_1$. Suppose the SAT solver yields:

$$m_1 = \{\, \langle\langle r0' \rangle\rangle = -2^{31},\ \langle\langle r1' \rangle\rangle = -1,\ \langle\langle r0 \rangle\rangle = -2^{31} + 1,\ \langle\langle r1 \rangle\rangle = -1 \,\}$$

We can equivalently write $m_1$ as an affine matrix, denoted $M_1 \in \mathbb{Z}^{4 \times 5}$. With the variable
ordering $(r0', r1', r0, r1)$ on columns, this gives:

$$M_1 = \left[\begin{array}{rrrr|r}
1 & 0 & 0 & 0 & -2^{31} \\
0 & 1 & 0 & 0 & -1 \\
0 & 0 & 1 & 0 & -2^{31} + 1 \\
0 & 0 & 0 & 1 & -1
\end{array}\right]$$

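We then add a disequality constraint $\langle\langle r1 \rangle\rangle \neq -1$ to $\xi(X)$ in order to obtain a new solution
that is not covered by $M_1$. Denote this formula by $\xi'(X)$. Then, solving for $\xi'(X)$ produces
a different model $m_2$, say:

$$m_2 = \{\, \langle\langle r0' \rangle\rangle = -2^{31} + 2,\ \langle\langle r1' \rangle\rangle = -3,\ \langle\langle r0 \rangle\rangle = -2^{31} + 1,\ \langle\langle r1 \rangle\rangle = -3 \,\}$$

Joining $M_1$ with $M_2$, which is likewise obtained from $m_2$, using the algorithm of Müller-Olm
and Seidl [60] yields a matrix that describes the affine relations common to both
models:

$$M_1 \sqcup M_2 =
\left[\begin{array}{rrrr|r}
1 & 0 & 0 & 0 & -2^{31} \\
0 & 1 & 0 & 0 & -1 \\
0 & 0 & 1 & 0 & -2^{31} + 1 \\
0 & 0 & 0 & 1 & -1
\end{array}\right]
\sqcup
\left[\begin{array}{rrrr|r}
1 & 0 & 0 & 0 & -2^{31} + 2 \\
0 & 1 & 0 & 0 & -3 \\
0 & 0 & 1 & 0 & -2^{31} + 1 \\
0 & 0 & 0 & 1 & -3
\end{array}\right]
=
\left[\begin{array}{rrrr|r}
1 & 0 & 0 & 1 & -2^{31} - 1 \\
0 & 1 & 0 & -1 & 0 \\
0 & 0 & 1 & 0 & -2^{31} + 1
\end{array}\right]$$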

Our algorithm now attempts to find a model that violates the constraint given through the
last row, that is, $\langle\langle r0 \rangle\rangle = -2^{31} + 1$. Adding a disequality constraint to $\xi'(X)$ yields a new
formula $\xi''(X)$, for which a SAT solver finds a model:

$$m_3 = \{\, \langle\langle r0' \rangle\rangle = -2^{31},\ \langle\langle r1' \rangle\rangle = -4,\ \langle\langle r0 \rangle\rangle = -2^{31} + 4,\ \langle\langle r1 \rangle\rangle = -4 \,\}$$

Then, we join $M_1 \sqcup M_2$ with $M_3$ to give:

$$(M_1 \sqcup M_2) \sqcup M_3 =
\left[\begin{array}{rrrr|r}
1 & 0 & 0 & 1 & -2^{31} - 1 \\
0 & 1 & 0 & -1 & 0 \\
0 & 0 & 1 & 0 & -2^{31} + 1
\end{array}\right]
\sqcup
\left[\begin{array}{rrrr|r}
1 & 0 & 0 & 0 & -2^{31} \\
0 & 1 & 0 & 0 & -4 \\
0 & 0 & 1 & 0 & -2^{31} + 4 \\
0 & 0 & 0 & 1 & -4
\end{array}\right]
=
\left[\begin{array}{rrrr|r}
1 & 0 & 1 & 1 & -2^{32} \\
0 & 1 & 0 & -1 & 0
\end{array}\right]$$

Adding a disequality constraint to suppress $\langle\langle r1' \rangle\rangle - \langle\langle r1 \rangle\rangle = 0$ yields an unsatisfiable formula,
likewise for $\langle\langle r0' \rangle\rangle + \langle\langle r1 \rangle\rangle + \langle\langle r0 \rangle\rangle = -2^{32}$. Indeed, we have

$$(M_1 \sqcup M_2) \sqcup M_3 = \bigsqcup_{i \in \mathbb{N}} M_i$$

where $M_i$ are matrices describing different models $m_i$ of $\xi(X)$. An affine summary
of a mode-combination is thus in some sense universally quantified, since its relation is satisfied
by every model. Moreover $(M_1 \sqcup M_2) \sqcup M_3$ represents the best affine abstraction of $\xi(X)$
[13, 50]. Note too that the chain-length in the affine domain is linear in the number of
variables in the system [38]. Thus, the number of iterations required to compute a fixed
point is bounded by the number of variables and does not depend on the bit-width.

The resulting equations, however, express relationships between variables but not
between the ranges of the input and output intervals. As it turns out, we can lift $(M_1 \sqcup M_2) \sqcup M_3$
to an equation system over intervals by applying a set of straightforward transformations.
This is arguably the most natural way of deriving a transformer for intervals, though
we shall see that it does not extend well to octagons.
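The joins of Sect. 4.1 can be reproduced with elementary linear algebra over the rationals. The sketch below is not the algorithm of Müller-Olm and Seidl; it is an assumed, simpler substitute that computes the same affine hull by taking the null space of the homogenised models and normalising it to reduced row echelon form. The names `rref` and `affine_hull` and the tuples `P1`, `P2`, `P3` (the three models in the column ordering $(r0', r1', r0, r1)$) are introduced here for illustration.

```python
from fractions import Fraction


def rref(rows):
    """Reduced row echelon form over the rationals; returns (rows, pivot columns)."""
    rows = [[Fraction(x) for x in row] for row in rows]
    pivots = []
    r = 0
    ncols = len(rows[0]) if rows else 0
    for c in range(ncols):
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        pivot = rows[r][c]
        rows[r] = [x / pivot for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    return rows[:r], pivots


def affine_hull(points):
    """Rows [a | b] such that a . x = b holds for every given point."""
    n = len(points[0])
    # A vector (a, b) is an equality iff a . p - b = 0 for each point p.
    reduced, pivots = rref([list(p) + [-1] for p in points])
    basis = []
    for free in (c for c in range(n + 1) if c not in pivots):
        v = [Fraction(0)] * (n + 1)
        v[free] = Fraction(1)
        for row, pc in zip(reduced, pivots):
            v[pc] = -row[free]
        basis.append(v)
    equations, _ = rref(basis)
    return equations


# The three models m1, m2, m3 in the ordering (r0', r1', r0, r1).
P1 = (-2**31, -1, -2**31 + 1, -1)
P2 = (-2**31 + 2, -3, -2**31 + 1, -3)
P3 = (-2**31, -4, -2**31 + 4, -4)
```

With these inputs, `affine_hull([P1, P2])` yields three equalities and `affine_hull([P1, P2, P3])` yields two, in agreement with the matrices $M_1 \sqcup M_2$ and $(M_1 \sqcup M_2) \sqcup M_3$.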



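4.2. Lifting affine equalities to interval updates. We explain how to transform the
resulting affine system $(M_1 \sqcup M_2) \sqcup M_3$ over variables in $X$ into an equation system over
range boundaries that prescribes an update. To do so, let $V \subseteq X$ denote the bit-vectors
on entry of the block, and likewise let $V' \subseteq X$ denote the bit-vectors on exit. Further,
introduce sets of fresh variables

$$V_\ell = \{r0_\ell, r1_\ell\} \qquad V_u = \{r0_u, r1_u\} \qquad V'_\ell = \{r0'_\ell, r1'_\ell\} \qquad V'_u = \{r0'_u, r1'_u\}$$

to represent symbolic boundaries of each bit-vector in $V \cup V'$. If necessary, transform the
equations such that the left-hand side consists of only one variable in $V'$. For the above
system, this transformation gives:

$$\begin{array}{rcl}
\langle\langle r1' \rangle\rangle &=& \langle\langle r1 \rangle\rangle \\
\langle\langle r0' \rangle\rangle &=& -\langle\langle r0 \rangle\rangle - \langle\langle r1 \rangle\rangle - 2^{32}
\end{array}$$

These equations imply the following affine relations on interval boundaries:

$$\begin{array}{ll}
\langle\langle r1'_u \rangle\rangle = \langle\langle r1_u \rangle\rangle & \langle\langle r0'_u \rangle\rangle = -\langle\langle r1_\ell \rangle\rangle - \langle\langle r0_\ell \rangle\rangle - 2^{32} \\
\langle\langle r1'_\ell \rangle\rangle = \langle\langle r1_\ell \rangle\rangle & \langle\langle r0'_\ell \rangle\rangle = -\langle\langle r1_u \rangle\rangle - \langle\langle r0_u \rangle\rangle - 2^{32}
\end{array}$$

To derive such a system, transform each of the original equations into the form

$$\lambda_{v'} \cdot v' = \sum_{v \in V} \lambda_v \cdot v + d$$

where $v' \in V'$, $\lambda_{v'} > 0$ and $\lambda_v \in \mathbb{Z}$ for all $v \in V$. This can always be achieved due to the
variable ordering. For example, the system below on the left can be transformed into the
system on the right by applying elementary row operations: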



-1 
1 




1 



-1 
-1 



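Note that the leading coefficients are positive. We then replace each original equation by a
pair of equations as follows:

$$\begin{array}{rcl}
\lambda_{v'} \cdot v'_u &=& \sum_{v \in V} \lambda_v \cdot \beta(\lambda_v, v) + d \\
\lambda_{v'} \cdot v'_\ell &=& \sum_{v \in V} \lambda_v \cdot \beta(-\lambda_v, v) + d
\end{array}$$

where the map $\beta : \mathbb{Z} \times V \rightarrow (V_\ell \cup V_u)$ is defined as:

$$\beta(\lambda, v) = \left\{ \begin{array}{ll} v_\ell & \mbox{if } \lambda < 0 \\ v_u & \mbox{otherwise} \end{array} \right.$$

The key idea when constructing the upper bound is to replace each occurrence of a variable
in the original system with its upper bound in case its coefficient is positive, and with its
lower bound otherwise. This task is performed by $\beta$. An analogous technique is applied
when defining the lower bound. Applying this technique to all affine systems, we obtain the
following nine transfer functions over symbolic ranges, rather than concrete variables (with
the identity constraints on $r1'_\ell$ and $r1'_u$ omitted):

$$\begin{array}{ll}
f_{O^{(1)},O^{(4)},U^{(6)}} : & (\langle\langle r0'_\ell \rangle\rangle = -2^{31}) \wedge (\langle\langle r0'_u \rangle\rangle = -2^{31}) \\
f_{O^{(1)},O^{(4)},N^{(6)}} : & (\langle\langle r0'_\ell \rangle\rangle = 2^{32} - \langle\langle r0_u \rangle\rangle - \langle\langle r1_u \rangle\rangle) \wedge (\langle\langle r0'_u \rangle\rangle = 2^{32} - \langle\langle r0_\ell \rangle\rangle - \langle\langle r1_\ell \rangle\rangle) \\
f_{U^{(1)},O^{(4)},P^{(6)}} : & (\langle\langle r0'_\ell \rangle\rangle = -2^{32} - \langle\langle r0_u \rangle\rangle - \langle\langle r1_u \rangle\rangle) \wedge (\langle\langle r0'_u \rangle\rangle = -2^{32} - \langle\langle r0_\ell \rangle\rangle - \langle\langle r1_\ell \rangle\rangle) \\
f_{U^{(1)},O^{(4)},N^{(6)}} : & (\langle\langle r0'_\ell \rangle\rangle = 0) \wedge (\langle\langle r0'_u \rangle\rangle = 0) \\
f_{P^{(1)},E^{(4)},P^{(6)}} : & (\langle\langle r0'_\ell \rangle\rangle = \langle\langle r0_\ell \rangle\rangle + \langle\langle r1_\ell \rangle\rangle) \wedge (\langle\langle r0'_u \rangle\rangle = \langle\langle r0_u \rangle\rangle + \langle\langle r1_u \rangle\rangle) \\
f_{P^{(1)},O^{(4)},P^{(6)}} : & (\langle\langle r0'_\ell \rangle\rangle = -\langle\langle r0_u \rangle\rangle - \langle\langle r1_u \rangle\rangle) \wedge (\langle\langle r0'_u \rangle\rangle = -\langle\langle r0_\ell \rangle\rangle - \langle\langle r1_\ell \rangle\rangle) \\
f_{N^{(1)},E^{(4)},N^{(6)}} : & (\langle\langle r0'_\ell \rangle\rangle = \langle\langle r0_\ell \rangle\rangle + \langle\langle r1_\ell \rangle\rangle) \wedge (\langle\langle r0'_u \rangle\rangle = \langle\langle r0_u \rangle\rangle + \langle\langle r1_u \rangle\rangle) \\
f_{N^{(1)},O^{(4)},U^{(6)}} : & (\langle\langle r0'_\ell \rangle\rangle = -2^{31}) \wedge (\langle\langle r0'_u \rangle\rangle = -2^{31}) \\
f_{N^{(1)},O^{(4)},N^{(6)}} : & (\langle\langle r0'_\ell \rangle\rangle = -\langle\langle r0_u \rangle\rangle - \langle\langle r1_u \rangle\rangle) \wedge (\langle\langle r0'_u \rangle\rangle = -\langle\langle r0_\ell \rangle\rangle - \langle\langle r1_\ell \rangle\rangle)
\end{array}$$

To illustrate the accuracy of this result, consider the application of the transfer function
$f_{U^{(1)},O^{(4)},P^{(6)}}$ to the input intervals defined by:

$$\langle\langle r0_\ell \rangle\rangle = -2^{31} + 1 \qquad \langle\langle r0_u \rangle\rangle = -2^{31} + 4 \qquad \langle\langle r1_\ell \rangle\rangle = -20 \qquad \langle\langle r1_u \rangle\rangle = -10$$

Then, the above transfer function defines the output intervals by modelling the wrap that
occurs in the first instruction ADD R0 R1 to give $\langle\langle r0'_\ell \rangle\rangle = -2^{31} + 6$ and $\langle\langle r0'_u \rangle\rangle = -2^{31} + 19$.
When multiple guards are applicable, however, a merge operation needs to be applied to
combine the results of the different updates. This loses information. Further details of the
evaluation mechanism are discussed in Section 6.

It is interesting to compare this with how an interval analysis would proceed for the
block which, recall, is listed in Fig. 3.2. Initially, R0, R1 and R2 would respectively be
assigned the intervals $[-2^{31} + 1, -2^{31} + 4]$, $[-20, -10]$ and $[-2^{31}, 2^{31} - 1]$, the third interval
being vacuous. The ADD R0 R1 instruction will assign R0 to $[2^{31} - 19, 2^{31} - 6]$, simulating an
underflow, and MOV R2 R0 will update R2 to $[2^{31} - 19, 2^{31} - 6]$. The EOR R2 R1 instruction
will then reassign R2 to $[-2^{31}, -2^{31} + 2^{29} - 1]$, which is adjusted to $[0, 2^{30} - 1]$ by LSL R2.
In a carefully constructed interval analysis the transfer function for LSL R2 will also assign
the carry flag to 1. In such an analysis, the instruction SBC R2 R2 might even assign R2
to $[-1, -1]$ rather than a wider interval. Under this assumption ADD R0 R2 will update R0
to $[2^{31} - 20, 2^{31} - 7]$. Then, since the sign bit is clear and the following 26 high bits of R0 are
set for all values in the interval $[2^{31} - 20, 2^{31} - 7]$, a transfer function for EOR R0 R2 could
conceivably assign R0 to $[-2^{31}, -2^{31} + 31]$.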
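The lifting performed by the map $\beta$ in Sect. 4.2 can be sketched as follows. This is an illustrative fragment under stated assumptions: the leading coefficient $\lambda_{v'}$ is taken to be 1, the equation is passed as a dictionary of coefficients together with a constant, and the function name `lift_to_interval` is hypothetical.

```python
def lift_to_interval(coeffs, const, lower, upper):
    """Bounds on v' for an equation v' = sum(coeffs[v] * v) + const.

    `lower` and `upper` map each input variable to its interval bounds.
    """
    def beta(lam, v):
        # Lower bound of v for a negative coefficient, upper bound otherwise.
        return lower[v] if lam < 0 else upper[v]

    hi = sum(lam * beta(lam, v) for v, lam in coeffs.items()) + const
    lo = sum(lam * beta(-lam, v) for v, lam in coeffs.items()) + const
    return lo, hi


# The equation r0' = -r0 - r1 - 2^32 applied to the example intervals.
example = lift_to_interval(
    {"r0": -1, "r1": -1},
    -2**32,
    lower={"r0": -2**31 + 1, "r1": -20},
    upper={"r0": -2**31 + 4, "r1": -10},
)
```

Here `example` evaluates to the pair $(-2^{31} + 6, -2^{31} + 19)$, matching the output interval stated for the underflow mode.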
Table 2: Intermediate results for inferring exact affine transformers for octagons

4.3. Lifting affine equalities to octagonal updates. Consider now the more general
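problem of deriving a transfer function for octagons for ADD R0 R1; LSL R0 where ADD
and LSL operate in exact non-negative modes. Computing the affine relation for this
mode-combination gives $(\langle\langle r0' \rangle\rangle = 2 \cdot \langle\langle r0 \rangle\rangle + 2 \cdot \langle\langle r1 \rangle\rangle) \wedge (\langle\langle r1' \rangle\rangle = \langle\langle r1 \rangle\rangle)$. We aim to construct an
update that maps octagonal input constraints with symbolic constants to octagonal outputs
likewise with symbolic constants of the form:

$$\begin{array}{rcl@{\qquad}rcl}
\langle\langle r0 \rangle\rangle &\leq& d_1 & \langle\langle r0' \rangle\rangle &\leq& d'_1 \\
\langle\langle r1 \rangle\rangle &\leq& d_2 & \langle\langle r1' \rangle\rangle &\leq& d'_2 \\
-\langle\langle r0 \rangle\rangle &\leq& d_3 & -\langle\langle r0' \rangle\rangle &\leq& d'_3 \\
-\langle\langle r1 \rangle\rangle &\leq& d_4 & -\langle\langle r1' \rangle\rangle &\leq& d'_4 \\
\hline
\langle\langle r0 \rangle\rangle + \langle\langle r1 \rangle\rangle &\leq& d_5 & \langle\langle r0' \rangle\rangle + \langle\langle r1' \rangle\rangle &\leq& d'_5 \\
-\langle\langle r0 \rangle\rangle - \langle\langle r1 \rangle\rangle &\leq& d_6 & -\langle\langle r0' \rangle\rangle - \langle\langle r1' \rangle\rangle &\leq& d'_6 \\
-\langle\langle r0 \rangle\rangle + \langle\langle r1 \rangle\rangle &\leq& d_7 & -\langle\langle r0' \rangle\rangle + \langle\langle r1' \rangle\rangle &\leq& d'_7 \\
\langle\langle r0 \rangle\rangle - \langle\langle r1 \rangle\rangle &\leq& d_8 & \langle\langle r0' \rangle\rangle - \langle\langle r1' \rangle\rangle &\leq& d'_8
\end{array}$$

We start by constructing an update operation that uses the unary input constraints only,
which appear above the bar separator. We modify the method presented in Sect. 4.2 so
as to express output constraints in terms of symbolic variables $d_1, \ldots, d_4$ from the input
constraints. We obtain the four output unary constraints by an analogous technique as
before by substituting the symbolic minima and maxima for the symbolic output constants.
The binary output constraints are derived by linear combinations of the unary output
constraints. Since the output constraints do not use relational information from the inputs,
such as $\langle\langle r0 \rangle\rangle + \langle\langle r1 \rangle\rangle \leq d_5$, we obtain a sub-optimal update. To illustrate, suppose
$0 \leq \langle\langle r0 \rangle\rangle \leq 4$, $0 \leq \langle\langle r1 \rangle\rangle \leq 1$ and $\langle\langle r0 \rangle\rangle + \langle\langle r1 \rangle\rangle \leq 4$. Then we derive:

$$0 \leq \langle\langle r0' \rangle\rangle \leq 10 \qquad 0 \leq \langle\langle r1' \rangle\rangle \leq 1 \qquad 0 \leq \langle\langle r0' \rangle\rangle + \langle\langle r1' \rangle\rangle \leq 11$$

An optimal transfer function, however, would derive $\langle\langle r0' \rangle\rangle \leq 8$ and $\langle\langle r0' \rangle\rangle + \langle\langle r1' \rangle\rangle \leq 9$.
Although the above method fails to propagate the effect of some inputs into the outputs,
it retains the property that the update can be constructed straightforwardly by lifting the
affine relations. In what follows, we will describe how to derive more precise affine relations
for the outputs.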
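The gap between the lifted bounds and the optimal ones can be confirmed by enumeration. The following sketch assumes the example octagon of this section (lower bounds of 0 on both inputs) and the relation $r0' = 2 \cdot r0 + 2 \cdot r1$, $r1' = r1$; the function names are illustrative, and exhaustive enumeration replaces the SAT-based maximisation.

```python
def lifted_bounds(r0_max, r1_max):
    """Upper bounds on r0' and r0' + r1' using the unary input bounds only."""
    r0_out = 2 * r0_max + 2 * r1_max
    return r0_out, r0_out + r1_max


def optimal_bounds(r0_max, r1_max, sum_max):
    """Upper bounds obtained by enumerating every point of the input octagon
    0 <= r0 <= r0_max, 0 <= r1 <= r1_max, r0 + r1 <= sum_max."""
    points = [
        (r0, r1)
        for r0 in range(r0_max + 1)
        for r1 in range(r1_max + 1)
        if r0 + r1 <= sum_max
    ]
    best_r0_out = max(2 * r0 + 2 * r1 for r0, r1 in points)
    best_sum_out = max((2 * r0 + 2 * r1) + r1 for r0, r1 in points)
    return best_r0_out, best_sum_out
```

For the example, `lifted_bounds(4, 1)` gives the bounds 10 and 11 whereas `optimal_bounds(4, 1, 4)` gives 8 and 9, the difference being due entirely to the relational constraint on the sum of the inputs.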

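4.4. Inferring affine inequalities for octagonal updates. To derive more precise affine
updates for octagons, let $\xi(X)$ denote the propositional encoding for ADD R0 R1; LSL R0
where again ADD and LSL operate in exact non-negative modes. Consider the inequality $\langle\langle r0' \rangle\rangle \leq d'_1$
in the output octagon and in particular the problem of discovering a relationship between
$d'_1$ and the symbolic constants $d_1, \ldots, d_8$ of the input octagon, as detailed previously.

We proceed by introducing signed 34-bit vectors $\mathbf{d}_1, \ldots, \mathbf{d}_8$ to represent the symbolic
constants $d_1, \ldots, d_8$. Further, let $\kappa$ denote a Boolean formula that holds iff the eight
inequalities $\langle\langle r0 \rangle\rangle \leq \langle\langle \mathbf{d}_1 \rangle\rangle, \ldots, \langle\langle r0 \rangle\rangle - \langle\langle r1 \rangle\rangle \leq \langle\langle \mathbf{d}_8 \rangle\rangle$ simultaneously hold. Furthermore, let
$\eta$ denote a formula that encodes the equality $\langle\langle r0' \rangle\rangle = \langle\langle \mathbf{d}'_1 \rangle\rangle$ where $\mathbf{d}'_1$ is a signed bit-vector
representing $d'_1$. Presenting the compound formula $\kappa \wedge \xi(X) \wedge \eta$ to a SAT solver produces
a model:

$$m_1 = \{\, \langle\langle \mathbf{d}'_1 \rangle\rangle = 1,\ \langle\langle \mathbf{d}_1 \rangle\rangle = 1,\ \langle\langle \mathbf{d}_2 \rangle\rangle = 1,\ \ldots,\ \langle\langle \mathbf{d}_7 \rangle\rangle = 1,\ \langle\langle \mathbf{d}_8 \rangle\rangle = 1 \,\}$$

which is fully detailed in Tab. 2. The assignment $\langle\langle \mathbf{d}'_1 \rangle\rangle = 1$ does not necessarily represent
the maximum value of $\langle\langle \mathbf{d}'_1 \rangle\rangle$ for the partial assignment $\langle\langle \mathbf{d}_1 \rangle\rangle = 1, \ldots, \langle\langle \mathbf{d}_8 \rangle\rangle = 1$. Thus let
$\zeta_1$ denote a formula that holds iff $\langle\langle \mathbf{d}_1 \rangle\rangle = 1, \ldots, \langle\langle \mathbf{d}_8 \rangle\rangle = 1$ all hold. Then range refinement
can be applied to find the maximal value of $\langle\langle \mathbf{d}'_1 \rangle\rangle$ subject to $\kappa \wedge \xi(X) \wedge \eta \wedge \zeta_1$. This gives
$\langle\langle \mathbf{d}'_1 \rangle\rangle = 2$ and a model:

$$m'_1 = \{\, \langle\langle \mathbf{d}'_1 \rangle\rangle = 2,\ \langle\langle \mathbf{d}_1 \rangle\rangle = 1,\ \ldots,\ \langle\langle \mathbf{d}_8 \rangle\rangle = 1 \,\}$$

An affine summary of all such maximal models can be found by interleaving range refinement
with affine join. Thus suppose the matrix $M_1$ is constructed from $m'_1$ by using the variable
ordering $(d'_1, d_1, \ldots, d_8)$ on columns:

$$M_1 = \left[\, I_9 \mid (2, 1, 1, 1, 1, 1, 1, 1, 1)^T \,\right]$$

where $I_9$ denotes the $9 \times 9$ identity matrix.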

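The method proceeds in an analogous fashion to before by constructing a formula $\mu$ that
holds iff $\langle\langle \mathbf{d}_8 \rangle\rangle \neq 1$ holds. Solving the formula $\kappa \wedge \xi(X) \wedge \eta \wedge \mu$ gives the model $m_2$ detailed
in Tab. 2. The model $m_2$, itself, defines a formula $\zeta_2$ that is equi-satisfiable with the
conjunction of $\langle\langle \mathbf{d}_1 \rangle\rangle = 3, \ldots, \langle\langle \mathbf{d}_8 \rangle\rangle = 0$. Maximising $\langle\langle \mathbf{d}'_1 \rangle\rangle$ subject to $\kappa \wedge \xi(X) \wedge \eta \wedge \zeta_2$
gives $\langle\langle \mathbf{d}'_1 \rangle\rangle = 10$ which defines the model

$$m'_2 = \{\, \langle\langle \mathbf{d}'_1 \rangle\rangle = 10,\ \langle\langle \mathbf{d}_1 \rangle\rangle = 3,\ \ldots,\ \langle\langle \mathbf{d}_8 \rangle\rangle = 0 \,\}$$

and $M_2$, which in turn yields the join $M_1 \sqcup M_2$ as follows: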

Mi UM 2 



-1 














1 





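Repeating this process two more times then gives:

$$\begin{array}{rcl}
m'_3 &=& \{\, \langle\langle \mathbf{d}'_1 \rangle\rangle = 26,\ \langle\langle \mathbf{d}_1 \rangle\rangle = 8,\ \ldots,\ \langle\langle \mathbf{d}_8 \rangle\rangle = 0 \,\} \\
m'_4 &=& \{\, \langle\langle \mathbf{d}'_1 \rangle\rangle = 6,\ \langle\langle \mathbf{d}_1 \rangle\rangle = 0,\ \ldots,\ \langle\langle \mathbf{d}_8 \rangle\rangle = 3 \,\}
\end{array}$$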
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Mi U M 2 U M 3 



10 -2 000 
1 -1 1 -1 



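$$M_1 \sqcup M_2 \sqcup M_3 \sqcup M_4 = \left[\begin{array}{rrrrrrrrr|r} 1 & 0 & 0 & 0 & 0 & -2 & 0 & 0 & 0 & 0 \end{array}\right]$$

The system $M_1 \sqcup M_2 \sqcup M_3 \sqcup M_4$ then expresses the relationship $\langle\langle \mathbf{d}'_1 \rangle\rangle = 2 \cdot \langle\langle \mathbf{d}_5 \rangle\rangle$. In
summary, each iteration of the algorithm involves the following steps: find a model of
$\kappa \wedge \xi(X) \wedge \eta \wedge \mu$ where $\mu$ ensures that the model is not already summarised by $\bigsqcup_{j=1}^{i} M_j$;
apply range refinement to maximise $\langle\langle \mathbf{d}'_1 \rangle\rangle$ whilst keeping $\langle\langle \mathbf{d}_1 \rangle\rangle, \langle\langle \mathbf{d}_2 \rangle\rangle, \ldots, \langle\langle \mathbf{d}_8 \rangle\rangle$ fixed; join
the resulting model with $\bigsqcup_{j=1}^{i} M_j$ to give $\bigsqcup_{j=1}^{i+1} M_j$.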

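To verify that $\langle\langle \mathbf{d}'_1 \rangle\rangle = 2 \cdot \langle\langle \mathbf{d}_5 \rangle\rangle$ is a fixed point, unlike before, it is not sufficient to impose
the disequality $\langle\langle \mathbf{d}'_1 \rangle\rangle \neq 2 \cdot \langle\langle \mathbf{d}_5 \rangle\rangle$ and check for unsatisfiability. This is because $\langle\langle \mathbf{d}'_1 \rangle\rangle$ is
defined through maximisation. Instead the check amounts to testing whether $\kappa \wedge \xi(X) \wedge \eta$
is unsatisfiable when combined with a formula encoding the strict inequality $\langle\langle \mathbf{d}'_1 \rangle\rangle > 2 \cdot \langle\langle \mathbf{d}_5 \rangle\rangle$
(note that if $\langle\langle \mathbf{d}'_1 \rangle\rangle > 2 \cdot \langle\langle \mathbf{d}_5 \rangle\rangle$ holds then it follows that $\langle\langle \mathbf{d}'_1 \rangle\rangle \neq 2 \cdot \langle\langle \mathbf{d}_5 \rangle\rangle$ holds). Since the
combined system is unsatisfiable, we conclude that the update for this mode-combination
includes $d'_1 = 2 \cdot d_5$. The complete affine update consists of:

$$\begin{array}{ll}
d'_1 = 2 \cdot d_5 & d'_5 = 2 \cdot d_5 + d_2 \\
d'_2 = d_2 & d'_6 = 2 \cdot d_6 + d_4 \\
d'_3 = 2 \cdot d_6 & d'_7 = 2 \cdot d_6 + d_2 \\
d'_4 = d_4 & d'_8 = 2 \cdot d_5 + d_4
\end{array}$$

This result is superior to that computed in Sect. 4.3. To illustrate, consider again an input
octagon defined by $0 \leq \langle\langle r0 \rangle\rangle \leq 4$, $0 \leq \langle\langle r1 \rangle\rangle \leq 1$ and $\langle\langle r0 \rangle\rangle + \langle\langle r1 \rangle\rangle \leq 4$, hence:

$$d_1 = 4 \qquad d_2 = 1 \qquad d_3 = 0 \qquad d_4 = 0 \qquad d_5 = 4$$

Applying the computed transformer to derive $d'_5$ on output gives:

$$d'_5 = 2 \cdot 4 + 1 = 9$$

Hence, we have $\langle\langle r0' \rangle\rangle + \langle\langle r1' \rangle\rangle \leq 9$, whereas the previously discussed technique based on
applying the $\beta$ map yields $\langle\langle r0' \rangle\rangle + \langle\langle r1' \rangle\rangle \leq 11$. Indeed, these linear symbolic update
operations are optimal in the sense that if a symbolic output constant $d'_j$ is equal to a linear
function of the symbolic input constants $d_1, \ldots, d_8$, then that function will be derived.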

Interestingly, Miné [56, Fig. 27] also discusses the relative precision of transfer functions,
though where the base semantics is polyhedral rather than Boolean. Using his classification,
the transfer functions derived using the synthesis techniques presented in Sect. 4.3 and
Sect. 4.4 might be described as medium and exact. The following theorem confirms this
intuition. For ease of presentation, the result states the exactitude of the update on the
symbolic constant $d'_1$; analogous results hold for updates on $d'_2, \ldots, d'_8$.

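Theorem 4.1. Suppose an octagonal update is derived of the form $M(d'_1, d_1, \ldots, d_8, -1) = 0$.
Moreover suppose that

• for all values of $\langle\langle r0 \rangle\rangle$, $\langle\langle r1 \rangle\rangle$ such that $\langle\langle r0 \rangle\rangle \leq d_1, \ldots, \langle\langle r0 \rangle\rangle - \langle\langle r1 \rangle\rangle \leq d_8$ and $\xi(X)$ hold
it follows that $\langle\langle r0' \rangle\rangle \leq c + c_1 \cdot d_1 + \ldots + c_8 \cdot d_8$ holds

• for all values of $\langle\langle r0 \rangle\rangle$, $\langle\langle r1 \rangle\rangle$ there exists a value of $\langle\langle r0' \rangle\rangle$ such that $\langle\langle r0 \rangle\rangle \leq d_1, \ldots,$
$\langle\langle r0 \rangle\rangle - \langle\langle r1 \rangle\rangle \leq d_8$, $\xi(X)$ and $\langle\langle r0' \rangle\rangle = c + c_1 \cdot d_1 + \ldots + c_8 \cdot d_8$ hold

Then $M(d'_1, d_1, \ldots, d_8, -1) = 0 \models d'_1 = c + c_1 \cdot d_1 + \ldots + c_8 \cdot d_8$.

Proof. Suppose that $M$ is derived by $M = M_1 \sqcup M_2 \sqcup \ldots \sqcup M_\ell$. Suppose $M_1$ is constructed
from the model $m_1 = \{d'_1 = v'_1, d_1 = v_1, \ldots, d_8 = v_8\}$ where the value $v'_1$ is maximal. Yet
$m_1$ is derived from a formula that encodes the equality $\langle\langle r0' \rangle\rangle = \langle\langle \mathbf{d}'_1 \rangle\rangle$ where $\mathbf{d}'_1$ is a signed
bit-vector representing $d'_1$. Since $v'_1$ is maximal it follows that the value of $\langle\langle r0' \rangle\rangle$ is maximal,
hence $(d'_1 = v'_1) \wedge (d_1 = v_1) \wedge \ldots \wedge (d_8 = v_8) \models d'_1 = c + c_1 \cdot d_1 + \ldots + c_8 \cdot d_8$ by the two
assumptions. Therefore $M_1(d'_1, d_1, \ldots, d_8, -1) = 0 \models d'_1 = c + c_1 \cdot d_1 + \ldots + c_8 \cdot d_8$ since
$M_1 = [\, I \mid (v'_1, v_1, \ldots, v_8) \,]$. Likewise $M_i(d'_1, d_1, \ldots, d_8, -1) = 0 \models d'_1 = c + c_1 \cdot d_1 + \ldots + c_8 \cdot d_8$
for all $1 \leq i \leq \ell$. The result follows since $M$ is the least upper bound of $M_1, M_2, \ldots, M_\ell$
whereas $d'_1 = c + c_1 \cdot d_1 + \ldots + c_8 \cdot d_8$ is an upper bound. □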



5. Deriving Updates with Templates 

The previous section showed how linear equalities can be used to relate a symbolic constant 
of an inequality in the output octagon to the symbolic constants on the inequalities of the 
input octagon. In this section we develop complementary techniques for updates that cannot 
be characterised in this way. To illustrate the problem, Sect. 5.1 introduces an example
which demonstrates why it can be propitious to base updates on symbolic bounds (range)
constraints. Then, Sect. 5.2 refines this observation, demonstrating the role of octagonal
inequalities in constructing updates, while Sect. 5.3 shows how equality constraints can be
combined with auxiliary variables [3"1 1241 150] . to derive non-linear relationships between an 
output constant and the symbolic input constants. These techniques all share the use of 
templates, either in the syntactic form of the linear inequalities, or the terms that arise in 
the non-linear equalities. 

5.1. Bounds constraints. To illustrate the problem with affine updates, consider the
following code block:

1: AND R0 15; 2: AND R1 15; 3: XOR R0 R1; 4: ADD R0 R1;

The operations AND and XOR are uni-modal; ADD is multi-modal but it only operates in the
exact non-negative mode for this block. Since the AND instructions truncate the contents of
R0 and R1 to the values stored in their four low-order bits (an operation which is non-linear), no affine
relationship exists between the symbolic constants $d_i$ that characterise the input octagon
and those $d'_i$ that characterise the output octagon. However, observe that it is still possible
to find a bound on $d'_1$. In fact, range refinement, as detailed in Sect. 3.2.3, can be applied
to maximise $\langle\langle r0' \rangle\rangle$ to infer $\langle\langle r0' \rangle\rangle \leq 30$, hence the update $d'_1 = 30$. Repeating this tactic
for the remaining symbolic output constants yields:

$$\begin{array}{llll}
d'_1 = 30 & d'_3 = 0 & d'_5 = 45 & d'_7 = 0 \\
d'_2 = 15 & d'_4 = 0 & d'_6 = 0 & d'_8 = 15
\end{array}$$
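The constant bounds for this block are small enough to be confirmed by enumeration. The sketch below is illustrative: it assumes the ordering of the eight octagonal expressions $r0', r1', -r0', -r1', r0' + r1', -r0' - r1', -r0' + r1', r0' - r1'$ for $d'_1, \ldots, d'_8$, models the four instructions on unbounded integers (no wrap can occur after masking), and uses exhaustive search in place of range refinement. The names `block` and `output_bounds` are introduced here.

```python
def block(r0, r1):
    """AND R0 15; AND R1 15; XOR R0 R1; ADD R0 R1."""
    r0 &= 15
    r1 &= 15
    r0 ^= r1
    r0 += r1
    return r0, r1


def output_bounds(inputs):
    """Maximum of each octagonal expression over all pairs of inputs."""
    outputs = [block(r0, r1) for r0 in inputs for r1 in inputs]
    expressions = [
        lambda a, b: a,       # d'_1
        lambda a, b: b,       # d'_2
        lambda a, b: -a,      # d'_3
        lambda a, b: -b,      # d'_4
        lambda a, b: a + b,   # d'_5
        lambda a, b: -a - b,  # d'_6
        lambda a, b: -a + b,  # d'_7
        lambda a, b: a - b,   # d'_8
    ]
    return [max(e(a, b) for a, b in outputs) for e in expressions]
```

Any input range that covers all sixteen values of the low four bits, for example `range(-64, 64)`, produces the same eight constants.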
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5.2. Octagonal inequality constraints. Ranges are merely a degenerate form of octag- 
onal inequality, which suggests using octagons to relate an output d\ to an input dj. To 
illustrate this idea, consider the following code that rounds RO up the next multiple of 16: 

1 : MOV Rl RO; 2 : NEG Rl; 3 : AND RO 15; 4 : ADD RO Rl; 

The instruction NEG R1 computes the two's complement of R1, updating R1 with the result.
The instructions NEG R1 and ADD R0 R1 are multi-modal, thus consider the feasible mode
in which both instructions are exact, the former being negative and the latter non-negative.
To search for a relationship between $d'_1$ and $d_1$, the expression ⟨⟨r0'⟩⟩ − ⟨⟨r0⟩⟩ is maximised
to infer ⟨⟨r0'⟩⟩ − ⟨⟨r0⟩⟩ ≤ 15, hence ⟨⟨r0'⟩⟩ ≤ ⟨⟨r0⟩⟩ + 15 and thus the update $d'_1 = d_1 + 15$.
Maximising the remaining expressions

\[
\begin{array}{ll}
\langle\langle r0' \rangle\rangle - (+\langle\langle r1 \rangle\rangle + \langle\langle r0 \rangle\rangle) & \\
\langle\langle r0' \rangle\rangle - (+\langle\langle r1 \rangle\rangle) &
\langle\langle r0' \rangle\rangle - (-\langle\langle r1 \rangle\rangle - \langle\langle r0 \rangle\rangle) \\
\langle\langle r0' \rangle\rangle - (-\langle\langle r0 \rangle\rangle) &
\langle\langle r0' \rangle\rangle - (-\langle\langle r1 \rangle\rangle + \langle\langle r0 \rangle\rangle) \\
\langle\langle r0' \rangle\rangle - (-\langle\langle r1 \rangle\rangle) &
\langle\langle r0' \rangle\rangle - (+\langle\langle r1 \rangle\rangle - \langle\langle r0 \rangle\rangle)
\end{array}
\]

in general, derives invariants of the form $d'_1 \le d_2 + c, \ldots, d'_1 \le d_7 + c$ where $c$ is some
constant, any of which can be strengthened to an equality and interpreted as an update.
However, in this case, these additional updates do not yield any further useful bounds on
$d'_1$. Observe too that some of the above expressions involve 3 variables, whereas some of
the expressions that bound $d'_5$, $d'_6$, $d'_7$ and $d'_8$ involve 4 variables, even though the updates
are themselves octagonal. Completing this derivation for the above example yields:



\[
\begin{array}{ll}
d'_1 = d_1 + 15 & d'_5 = d_1 + 30 \\
d'_2 = 15       & d'_6 = d_3 \\
d'_3 = d_3      & d'_7 = d_3 + 15 \\
d'_4 = 0        & d'_8 = d_1 + 15
\end{array}
\]
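
The bound ⟨⟨r0'⟩⟩ − ⟨⟨r0⟩⟩ ≤ 15 can be sanity-checked by exhaustive enumeration. The following
Python sketch is only an illustration and is not the synthesis procedure itself. It assumes
signed 8-bit registers, assumes that instruction 3 masks R1 (the reading under which the block
rounds R0 up to a multiple of 16), and keeps only the inputs for which NEG yields a negative
result and ADD is exact. The function name round_up_block is hypothetical.

```python
def round_up_block(r0):
    """Simulate MOV R1 R0; NEG R1; AND R1 15; ADD R0 R1 on 8-bit registers.

    Returns (r0', r1'), or None when the input lies outside the mode
    considered in the text (NEG negative, ADD exact and non-negative).
    """
    r1 = r0                      # MOV R1 R0
    if r1 <= 0:                  # NEG must produce a negative result
        return None
    r1 = (-r1) & 0xFF            # NEG R1 (two's complement on 8 bits)
    r1 &= 15                     # AND R1 15
    total = r0 + r1              # ADD R0 R1
    if total > 127:              # signed overflow, so not the exact mode
        return None
    return total, r1


results = {r0: round_up_block(r0) for r0 in range(-128, 128)}
feasible = {r0: out for r0, out in results.items() if out is not None}

# Largest observed value of r0' - r0 and of r1' over the feasible mode.
max_gap = max(out[0] - r0 for r0, out in feasible.items())
max_r1 = max(out[1] for out in feasible.values())
```

Under these assumptions the enumeration agrees with the maximisation described above: the gap
between output and input never exceeds 15, and every output is a multiple of 16.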



5.3. Non-linear equality constraints. Relaxing the equalities to inequalities provides 
one degree of freedom for generalising updates; relaxing linear equalities to non-linear ones 
provides another. Polynomial extensions [60, Sect. 6] have been proposed for generalising 
linear equality analysis, and there is no reason why this technique cannot be adapted to the 
problem of deriving transfer functions. 

5.3.1. Generating non-linear equality constraints. The idea is to augment the original
variables in the block with fresh variables specifically introduced to denote non-linear terms.
The terms are drawn from a finite language of templates that typically includes monomials
up to a fixed degree. To illustrate, consider the following basic block which computes an
offset relative to the start location of a two-dimensional array, where the registers
R0 and R1 represent the row and column coordinates (which are indexed from 0). Register
R2 represents the row size; all registers are signed.

1 : MUL R0 R2; 2 : ADD R0 R1;

Assume the block is described as a Boolean formula $\varphi(X)$ and all operations are exact. As
before, the values of R0, R1 and R2 on input are represented using bit-vectors r0, r1 and
r2, whereas r0' denotes the value of R0 on output. Passing $\varphi(X)$ to a solver yields a model
$m_1$ as before, say:

\[
m_1 = \{ \langle\langle r0 \rangle\rangle = 2,\ \langle\langle r1 \rangle\rangle = 4,\ \langle\langle r2 \rangle\rangle = 3,\ \langle\langle r0' \rangle\rangle = 10 \}
\]

Instead of directly representing these values as an affine system, auxiliary variables are
introduced whose sole purpose is to represent some non-linear terms drawn from a set of
templates. In this case, we have a set that contains the single non-linear term ⟨⟨r0⟩⟩ · ⟨⟨r2⟩⟩,
hence we introduce a fresh variable s defined as s = ⟨⟨r0⟩⟩ · ⟨⟨r2⟩⟩. Since ⟨⟨r0⟩⟩ = 2 and
⟨⟨r2⟩⟩ = 3, it follows that s = 6. With the variable ordering (r0', r0, r1, r2, s) on columns, we
obtain the following affine system $M_1 \in \mathbb{Z}^{5 \times 6}$:



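\[
M_1 =
\begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 10 \\
0 & 1 & 0 & 0 & 0 & 2 \\
0 & 0 & 1 & 0 & 0 & 4 \\
0 & 0 & 0 & 1 & 0 & 3 \\
0 & 0 & 0 & 0 & 1 & 6
\end{bmatrix}
\]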



The procedure now proceeds much as before. The formula $\varphi(X)$ is augmented with
the constraint ⟨⟨r0⟩⟩ · ⟨⟨r2⟩⟩ ≠ 6, the resulting formula being denoted $\varphi'(X)$. (Propositional
encodings have been suggested for systems of inequality constraints over polynomial terms
whose size is quadratic in the number of symbols required to define the constraints [35,
Theorem 7].) Passing $\varphi'(X)$ to a SAT solver yields a model $m_2$:

\[
m_2 = \{ \langle\langle r0 \rangle\rangle = 3,\ \langle\langle r1 \rangle\rangle = 4,\ \langle\langle r2 \rangle\rangle = 8,\ \langle\langle r0' \rangle\rangle = 28 \}
\]

which implies that s = 24. The merge is then computed thus:



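\[
M_1 \sqcup M_2 =
\begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 10 \\
0 & 1 & 0 & 0 & 0 & 2 \\
0 & 0 & 1 & 0 & 0 & 4 \\
0 & 0 & 0 & 1 & 0 & 3 \\
0 & 0 & 0 & 0 & 1 & 6
\end{bmatrix}
\sqcup
\begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 28 \\
0 & 1 & 0 & 0 & 0 & 3 \\
0 & 0 & 1 & 0 & 0 & 4 \\
0 & 0 & 0 & 1 & 0 & 8 \\
0 & 0 & 0 & 0 & 1 & 24
\end{bmatrix}
=
\begin{bmatrix}
1 & -18 & 0 & 0 & 0 & -26 \\
0 & 5 & 0 & -1 & 0 & 7 \\
0 & 0 & 1 & 0 & 0 & 4 \\
0 & 0 & 0 & 18 & -5 & 24
\end{bmatrix}
\]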













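and the formula further augmented thus:

\[
\varphi''(X) = \varphi'(X) \wedge (18 \cdot \langle\langle r2 \rangle\rangle - 5 \cdot \langle\langle r0 \rangle\rangle \cdot \langle\langle r2 \rangle\rangle \neq 24)
\]

This formula gives another model:

\[
m_3 = \{ \langle\langle r0 \rangle\rangle = 2,\ \langle\langle r1 \rangle\rangle = 2,\ \langle\langle r2 \rangle\rangle = 2,\ \langle\langle r0' \rangle\rangle = 6 \}
\]

From this we deduce s = 4 and hence obtain: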



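\[
M_1 \sqcup M_2 \sqcup M_3 =
\begin{bmatrix}
1 & -18 & 0 & 0 & 0 & -26 \\
0 & 5 & 0 & -1 & 0 & 7 \\
0 & 0 & 1 & 0 & 0 & 4 \\
0 & 0 & 0 & 18 & -5 & 24
\end{bmatrix}
\sqcup
\begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 6 \\
0 & 1 & 0 & 0 & 0 & 2 \\
0 & 0 & 1 & 0 & 0 & 2 \\
0 & 0 & 0 & 1 & 0 & 2 \\
0 & 0 & 0 & 0 & 1 & 4
\end{bmatrix}
=
\begin{bmatrix}
1 & -18 & -2 & 0 & 0 & -34 \\
0 & 10 & 1 & -2 & 0 & 18 \\
0 & 0 & 4 & -18 & 5 & -8
\end{bmatrix}
\]

By passing:

\[
\varphi'''(X) = \varphi''(X) \wedge (4 \cdot \langle\langle r1 \rangle\rangle - 18 \cdot \langle\langle r2 \rangle\rangle + 5 \cdot \langle\langle r0 \rangle\rangle \cdot \langle\langle r2 \rangle\rangle \neq -8)
\]

to the solver, we generate yet another model:

\[
m_4 = \{ \langle\langle r0 \rangle\rangle = 4,\ \langle\langle r1 \rangle\rangle = 5,\ \langle\langle r2 \rangle\rangle = 2,\ \langle\langle r0' \rangle\rangle = 13 \}
\]

which implies that s = 8. Joining $M_4$ with $M_1 \sqcup M_2 \sqcup M_3$ then produces the system:

\[
M_1 \sqcup M_2 \sqcup M_3 \sqcup M_4 =
\begin{bmatrix}
12 & -64 & 0 & -70 & 11 & -152 \\
0 & 64 & -12 & 70 & -23 & 152
\end{bmatrix}
\]

Proceeding with one more iteration, we obtain the system:

\[
M_1 \sqcup M_2 \sqcup M_3 \sqcup M_4 \sqcup M_5 =
\begin{bmatrix}
1 & 0 & -1 & 0 & -1 & 0
\end{bmatrix}
\]

Since adding another disequality constraint yields an unsatisfiable system, the equation:

\[
\langle\langle r0' \rangle\rangle = \langle\langle r1 \rangle\rangle + s = \langle\langle r1 \rangle\rangle + \langle\langle r0 \rangle\rangle \cdot \langle\langle r2 \rangle\rangle
\]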

characterises the polynomial input-output relation implemented by this block. Observe that
the total number of calls to a SAT solver is still linear in the number of overall variables
(program variables plus auxiliary variables) due to linear chain lengths in affine spaces.
Our prototype implementation, written in Java on top of the [mc]square framework [73]
and Sat4J [52], computes this update in no more than 0.25s.
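
The iteration above (ask for a model outside the current affine system, join it in, repeat
until no such model exists) can be sketched in a few lines of Python. This is a minimal
illustration rather than the prototype described in the text: a brute-force enumeration over a
small non-negative range stands in for the SAT solver, rational Gauss-Jordan elimination
stands in for the affine join, and all function names are hypothetical. The range is kept small
so that every operation is exact.

```python
from fractions import Fraction
from itertools import product


def null_space(rows, n):
    """Basis of all vectors a with row . a = 0 for every row (Gauss-Jordan)."""
    rows = [list(r) for r in rows]
    pivots = []
    for col in range(n):
        r = len(pivots)
        hit = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if hit is None:
            continue
        rows[r], rows[hit] = rows[hit], rows[r]
        rows[r] = [x / rows[r][col] for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        pivots.append(col)
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        vec = [Fraction(0)] * n
        vec[free] = Fraction(1)
        for r, col in enumerate(pivots):
            vec[col] = -rows[r][free]
        basis.append(vec)
    return basis


def affine_hull(points):
    """Equalities [a_1, ..., a_n, b], read as a . x = b, that hold for all points."""
    base = points[0]
    directions = [[Fraction(x - y) for x, y in zip(p, base)] for p in points[1:]]
    return [vec + [sum(a * x for a, x in zip(vec, base))]
            for vec in null_space(directions, len(base))]


def models(limit=6):
    """Brute-force stand-in for the solver: exact runs of MUL R0 R2; ADD R0 R1."""
    for r0, r1, r2 in product(range(limit), repeat=3):
        yield (r0 * r2 + r1, r0, r1, r2, r0 * r2)   # columns (r0', r0, r1, r2, s)


def violates(eqs, m):
    """True if the model m falls outside the affine system eqs."""
    return any(sum(a * x for a, x in zip(e[:-1], m)) != e[-1] for e in eqs)


def synthesise():
    points, eqs, calls = [], None, 0
    while True:
        calls += 1                                   # one "solver call" per iteration
        witness = next((m for m in models() if eqs is None or violates(eqs, m)), None)
        if witness is None:                          # unsatisfiable: the system is stable
            return eqs, calls
        points.append(witness)
        eqs = affine_hull(points)


equalities, solver_calls = synthesise()
```

In this sketch the loop stops after six calls with the single equality −r0' + r1 + s = 0, which
is the relation r0' = r1 + r0 · r2 up to sign. Each new model raises the dimension of the
affine hull by one, which is why the number of calls is linear in the number of columns.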



5.3.2. Lifting non-linear equality constraints to intervals. Of course, as in Sect. 4.1, the
derived constraint relates neither interval bounds nor symbolic constants on the input and
output octagons. One would expect that non-linear equality constraints can be
straightforwardly lifted to updates using either the technique described in Sect. 4.3 or the
maximisation technique explained in Sect. 4.4. However, this is not so.

To illustrate, consider polynomial extension applied to interval relations, and in particular
the problem of lifting the above affine system to construct an update over the symbolic
bounds $r0_\ell$, $r0_u$, $r1_\ell$, $r1_u$, $r2_\ell$, $r2_u$, $r0'_\ell$ and $r0'_u$. Observe that the original equation

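\[
\langle\langle r0' \rangle\rangle = \langle\langle r1 \rangle\rangle + \langle\langle r0 \rangle\rangle \cdot \langle\langle r2 \rangle\rangle
\]

gives rise to two updates:

\[
\begin{array}{rcl}
\langle\langle r0'_\ell \rangle\rangle &=& \langle\langle r1_\ell \rangle\rangle + \min\{
\langle\langle r0_\ell \rangle\rangle \cdot \langle\langle r2_\ell \rangle\rangle,\
\langle\langle r0_\ell \rangle\rangle \cdot \langle\langle r2_u \rangle\rangle,\
\langle\langle r0_u \rangle\rangle \cdot \langle\langle r2_\ell \rangle\rangle,\
\langle\langle r0_u \rangle\rangle \cdot \langle\langle r2_u \rangle\rangle \} \\
\langle\langle r0'_u \rangle\rangle &=& \langle\langle r1_u \rangle\rangle + \max\{
\langle\langle r0_\ell \rangle\rangle \cdot \langle\langle r2_\ell \rangle\rangle,\
\langle\langle r0_\ell \rangle\rangle \cdot \langle\langle r2_u \rangle\rangle,\
\langle\langle r0_u \rangle\rangle \cdot \langle\langle r2_\ell \rangle\rangle,\
\langle\langle r0_u \rangle\rangle \cdot \langle\langle r2_u \rangle\rangle \}
\end{array}
\]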

that involve, respectively, minimisation and maximisation operations. These operations are 
required because it is not until the symbolic bounds are instantiated that the relative sizes 
of the non-linear terms can be compared. (These comparisons are redundant for linear 
terms because they are monotonic.) 
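
The need for min and max can be seen by instantiating the symbolic bounds with an interval
for r0 that straddles zero. The Python sketch below (function names are hypothetical, and the
concrete intervals are chosen only for illustration) evaluates the two updates and compares
them with an exhaustive enumeration of r1 + r0 · r2 over the same intervals.

```python
def lift_to_interval(r0, r1, r2):
    """Each argument is a (lower, upper) pair; returns bounds on r0' = r1 + r0 * r2."""
    products = [a * b for a in r0 for b in r2]   # the four products of the bounds
    return r1[0] + min(products), r1[1] + max(products)


def concrete_outputs(r0, r1, r2):
    """Every value of r1 + r0 * r2 for integer inputs drawn from the intervals."""
    return [b + a * c
            for a in range(r0[0], r0[1] + 1)
            for b in range(r1[0], r1[1] + 1)
            for c in range(r2[0], r2[1] + 1)]


bounds = lift_to_interval((-2, 3), (4, 5), (-1, 8))
outputs = concrete_outputs((-2, 3), (4, 5), (-1, 8))
```

Here the lower bound comes from the product of the lower bound of r0 with the upper bound
of r2, a combination that cannot be fixed in advance without knowing the signs of the bounds.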
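To present this transformation formally, let $V \cup V'$ denote the input and output
variables, and $S$ denote a set of templates (monomials) over the variables $V$. Thus if $s \in S$ then
$s = \prod_{i=1}^{n} v_i$ for some $v_i \in V$. We introduce a map
$\mu(s) = \{ \prod_{i=1}^{n} w_i \mid w_i = v_{i,\ell} \vee w_i = v_{i,u} \}$ so
that, for example, if s = ⟨⟨r0⟩⟩ · ⟨⟨r2⟩⟩ then:

\[
\mu(\langle\langle r0 \rangle\rangle \cdot \langle\langle r2 \rangle\rangle) = \{
\langle\langle r0_\ell \rangle\rangle \cdot \langle\langle r2_\ell \rangle\rangle,\
\langle\langle r0_\ell \rangle\rangle \cdot \langle\langle r2_u \rangle\rangle,\
\langle\langle r0_u \rangle\rangle \cdot \langle\langle r2_\ell \rangle\rangle,\
\langle\langle r0_u \rangle\rangle \cdot \langle\langle r2_u \rangle\rangle \}
\]

Each of the polynomially extended equations takes the form:

\[
\lambda' \cdot v' = \sum_{v \in V} \lambda_v \cdot v + \sum_{s \in S} \lambda_s \cdot s + d
\]

where $v' \in V'$, $\lambda' \in \mathbb{N}$, $\lambda_v \in \mathbb{Z}$ for all $v \in V$, and $\lambda_s \in \mathbb{Z}$ for all $s \in S$.

We then replace each polynomially extended equation by a pair of equations as follows:

\[
\begin{array}{rcl}
\lambda' \cdot v'_\ell &=& \sum_{v \in V} \lambda_v \cdot \beta(-\lambda_v, v) + \sum_{s \in S} \lambda_s \cdot \gamma(-\lambda_s, s) + d \\
\lambda' \cdot v'_u &=& \sum_{v \in V} \lambda_v \cdot \beta(\lambda_v, v) + \sum_{s \in S} \lambda_s \cdot \gamma(\lambda_s, s) + d
\end{array}
\]

where $\beta$ is defined as before (in Sect. 4.2) and $\gamma$ transforms the monomials as follows:

\[
\gamma(\lambda, s) =
\begin{cases}
\min(\mu(s)) & \text{if } \lambda < 0 \\
\max(\mu(s)) & \text{otherwise}
\end{cases}
\]

Note that linear terms are transformed in the same manner as before.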
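
A small Python sketch of the maps $\mu$ and $\gamma$ may help to make the transformation concrete.
It is an illustration under stated assumptions, not a definitive implementation: monomials
are tuples of variable names, bounds are given as a dictionary from variable names to
(lower, upper) pairs, the coefficient $\lambda'$ is taken to be 1, and $\beta$ is assumed to select
the lower bound for a negative first argument and the upper bound otherwise, mirroring $\gamma$.
All identifiers are hypothetical.

```python
from itertools import product
from math import prod


def mu(monomial, bounds):
    """All products obtained by replacing each variable by its lower or upper bound."""
    return [prod(choice) for choice in product(*(bounds[v] for v in monomial))]


def gamma(lam, monomial, bounds):
    """min of mu(s) for a negative coefficient argument, max otherwise."""
    values = mu(monomial, bounds)
    return min(values) if lam < 0 else max(values)


def beta(lam, var, bounds):
    """Assumed definition: lower bound for a negative argument, upper bound otherwise."""
    lower, upper = bounds[var]
    return lower if lam < 0 else upper


def lift(linear, nonlinear, d, bounds):
    """Bounds (v'_l, v'_u) for v' = sum(lam_v * v) + sum(lam_s * s) + d, with lam' = 1."""
    lower = d + sum(l * beta(-l, v, bounds) for v, l in linear.items()) \
              + sum(l * gamma(-l, s, bounds) for s, l in nonlinear.items())
    upper = d + sum(l * beta(l, v, bounds) for v, l in linear.items()) \
              + sum(l * gamma(l, s, bounds) for s, l in nonlinear.items())
    return lower, upper


example_bounds = {"r0": (2, 3), "r1": (4, 5), "r2": (-1, 8)}
example = lift({"r1": 1}, {("r0", "r2"): 1}, 0, example_bounds)
```

Negating the first argument of $\beta$ and $\gamma$ in the lower-bound equation means that a term with
a positive coefficient contributes its smallest value and a term with a negative coefficient
contributes its largest, which is what a lower bound requires.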



5.3.3. Non-linear equality constraints and octagons. The minimisation and maximisation
terms that arise in interval updates suggest a tactic for inferring updates for octagons in
the presence of non-linear terms. To illustrate with the above example, the construction
proceeds by introducing fresh variables $s_1$ and $s_2$ defined such that:

\[
s_1 = \max(d_1 \cdot d_3,\ d_1 \cdot d_6,\ d_5 \cdot d_3,\ d_5 \cdot d_6) \qquad
s_2 = \min(d_1 \cdot d_3,\ d_1 \cdot d_6,\ d_5 \cdot d_3,\ d_5 \cdot d_6)
\]

Then maximisation is interleaved with affine join, as detailed in Sect. 4.4, so as to derive
updates between a $d'_i$ variable, the $d_j$ and the auxiliary $s_1$ and $s_2$ variables. By applying
this technique the following transfer function is derived:

Thus, for example, if the octagon describes a cube that is offset from the origin, namely
$d_1 = d_2 = d_3 = 3$ and $d_4 = d_5 = d_6 = -2$, then the bound on ⟨⟨r0'⟩⟩, denoted $d'_1$, is calculated
by $d'_1 = d_2 + s_1 = 3 + \max(3 \cdot 3,\ 3 \cdot -2,\ -2 \cdot 3,\ -2 \cdot -2) = 12$.
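
The arithmetic of this example can be replayed directly. The short Python fragment below is
only a worked check of the numbers quoted above; the dictionary d simply holds the six
symbolic constants of the cube.

```python
# Symbolic constants of the input cube: d1 = d2 = d3 = 3 and d4 = d5 = d6 = -2.
d = {1: 3, 2: 3, 3: 3, 4: -2, 5: -2, 6: -2}

# Auxiliary variables, following the definitions of s1 and s2 given above.
candidates = [d[1] * d[3], d[1] * d[6], d[5] * d[3], d[5] * d[6]]
s1 = max(candidates)
s2 = min(candidates)

# Update for the bound on r0', namely d'_1 = d_2 + s_1.
d1_out = d[2] + s1
```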

6. Evaluating Transfer Functions 

Thus far, we have described how to derive transfer functions for intervals and octagons
where the functions are systems of guards paired with affine updates, without reference to
how they are evaluated. In our previous work [13], the application of a transfer function
amounted to solving a series of integer linear programs (ILPs). To illustrate, suppose a
transfer function consists of a single guard $g$ and update $u$ pair and let $c$ denote a system
of octagonal constraints on the input variables. A single output inequality in the output
system, $c'$, such as $r0' + r1' \le d'_5$, can be derived by maximising $r0' + r1'$ subject to the
linear system $c \wedge g \wedge u$. To construct $c'$ in its entirety requires the solution of $O(n^2)$ ILPs
where $n$ is the number of registers (or variables) in the block. Although steady progress
has been made on deriving safe bounds for integer programs [63], a more attractive solution
computationally would avoid ILPs altogether.

6.1. A single guard and update pair. Affine updates, as derived in Sect. 4.4, relate
symbolic constants on the inequalities in the input octagon to those of the output octagon.
These updates confer a different, simpler, evaluation model. To compute $r0' + r1' \le d'_5$ in
$c'$ it is sufficient to compute $c \sqcap g$ [56], which is the octagon that describes the conjoined
system $c \wedge g$. This can be computed in quadratic time when $g$ is a single inequality and
in cubic time otherwise [56]. The meet $c \sqcap g$ then defines values for the symbolic constants
$d_i$, though these values may include $-\infty$ and $\infty$. The value of $d'_5$ is defined by its affine
update, that is, as a weighted sum of the $d_i$ values. If there is no affine update for $d'_5$,
then its value defaults to $\infty$. If bounds have been inferred for output octagons, then the
$d'_i$ can possibly be refined with a tighter bound. This evaluation mechanism thus replaces
ILP with arithmetic that is both conceptually simple and computationally efficient. This is
significant since transfer functions are themselves computed many times during fixed point
evaluation.
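
The evaluation model can be sketched as follows. The Python fragment is a minimal
illustration under stated assumptions: the meet with the guard is taken as already computed
and supplied as a dictionary d of input constants (possibly infinite), each update is a pair
of weights and a constant, and the sample updates are four of those derived for the rounding
block of Sect. 5.2. The names evaluate_update and apply_transfer are hypothetical.

```python
import math

INF = math.inf


def evaluate_update(weights, const, d):
    """Weighted sum of input constants; zero weights are skipped so 0 * inf never arises."""
    total = const
    for index, weight in weights.items():
        if weight != 0:
            total += weight * d[index]
    return total


def apply_transfer(updates, d, outputs=range(1, 9)):
    """Value of each output constant; a constant without an update defaults to infinity."""
    return {k: evaluate_update(*updates[k], d) if k in updates else INF
            for k in outputs}


# d'_1 = d_1 + 15, d'_2 = 15, d'_3 = d_3, d'_8 = d_1 + 15
UPDATES = {1: ({1: 1}, 15), 2: ({}, 15), 3: ({3: 1}, 0), 8: ({1: 1}, 15)}

# Input constants after the meet with the guard; d_3 is unbounded in this example.
d_in = {1: 20, 2: 7, 3: INF, 4: 0, 5: 27, 6: INF, 7: INF, 8: 20}
d_out = apply_transfer(UPDATES, d_in)
```

Each output constant costs one pass over its weights, so no optimisation problem is solved at
evaluation time.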

6.2. A system of guard and update pairs. The above evaluation procedure needs to be
applied for each guard $g$ and update $u$ pair for which $c \sqcap g$ is satisfiable. Thus several output
octagons may be derived for a single block. We do not prescribe how these octagons should
be combined; for example, a disjunctive representation is one possibility [36]. However, the
simplest tactic is undoubtedly to apply the merge operation for octagons [56] (though this
entails closing the output octagons).
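
As a rough illustration of the merge, the Python sketch below joins several output octagons
by taking the pointwise maximum of their symbolic constants. It assumes that every octagon
is already closed and is represented as a tuple of upper bounds listed in the same order of
inequalities; closure itself is not shown, and the function name merge is hypothetical.

```python
import math


def merge(octagons):
    """Join octagons given as equal-length tuples of upper bounds (assumed closed)."""
    return tuple(max(column) for column in zip(*octagons))


# Two output octagons, one per satisfiable guard, over three inequalities.
merged = merge([(3, 15, 0), (5, 7, math.inf)])
```

The merged octagon is the weakest of the inputs on every inequality, so it describes all
states reachable through either guard.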

6.3. A system of updates for template inequality constraints. Evaluating an octagonal
update represented as an affine equality, as discussed in Sect. 6.1, is straightforward
since each symbolic bound $d'_i$ on output is characterised by exactly one linear equation.
This is not necessarily the case if template inequality constraints have been applied to
derive updates, as discussed in Sect. 5.2. Recall, for example, that inequalities of the form
$d'_1 \le d_2 + c, \ldots, d'_1 \le d_7 + c$ can arise, all of which potentially induce non-trivial bounds on
$d'_1$. In general, a symbolic constant $d'_i$ in the output octagon might be related to the input
symbolic constants $d_1, \ldots, d_n$ through a system of $m$ inequalities:



where $c_j > 0$, otherwise the inequality does not bound $d'_i$ from above and can thus be
discarded. Although any of these inequalities can be strengthened to an equality and interpreted
as an update, it is more precise to compute:



I L \fc=i / J ) 

Therefore, in general, transfer function evaluation can involve the evaluation of several linear 
expressions for each symbolic constant in the output octagon. 
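
The gain in precision can be illustrated with a short Python sketch. It assumes, purely for
illustration, that each inequality has been normalised so that the output constant has
coefficient one, that is, it reads d'_i ≤ Σ_k w_k · d_k + c; the function name tightest_bound
and the sample inequalities are hypothetical.

```python
import math


def tightest_bound(inequalities, d):
    """Smallest upper bound on d'_i implied by inequalities of the form (weights, c)."""
    return min((sum(w * d[k] for k, w in weights.items() if w != 0) + c
                for weights, c in inequalities),
               default=math.inf)


# Two candidate bounds on the same output constant: d_1 + 15 and d_2 + 40.
candidates = [({1: 1}, 15), ({2: 1}, 40)]
bound = tightest_bound(candidates, {1: 20, 2: -10})
```

Strengthening only the first inequality to an update would give 35 for these inputs, whereas
evaluating both and taking the minimum gives 30.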



We have implemented the techniques described in this paper in Java using the Sat4J
solver [52], so as to integrate with our analysis framework for machine code [73], called
[mc]square, which is also coded in Java. All experiments were performed on a MacBook
Pro equipped with a 2.6 GHz dual-core processor and 4 GB of RAM, but only a single
core was used in our experiments.

To evaluate transfer function synthesis without quantifier elimination, Tab. 3 compares
the results for intervals for different blocks of assembly code to those obtained using the
technique described in [13]. This corresponds to the techniques presented in Sect. 3 and
Sect. 4. Column #instr contains the number of instructions, whereas column #bits gives the
bit-width. (The 8-bit and 32-bit versions of the AVR instruction sets are analogous.) Then,
#affine presents the number of affine relations for each block. The columns runtime contain
the runtime and the number of SAT instances. The overall runtime of the elimination-based
algorithm [13] is given in column old (∞ is used for timeout, which is set to 30s). Transfer
function synthesis for blocks of up to 10 instructions is evaluated, which is a typical size for
microcontroller code. For blocks of this size, we have never observed more than 10 feasible
mode combinations.

7.1. Comparison. Using quantifier elimination, all instances could be solved in a reason- 
able amount of time for 8-bit instructions. However, only the small instances could be 
solved for 32 bits (and only then because the Boolean encodings for the instructions were 
minimised prior to the synthesis of the transfer functions). It is also important to appreciate 
that none of the timeouts was caused by the SAT solver; it was resolution that failed to 
produce results in reasonable time. By way of comparison, synthesising guards for different 
overflow modes requires most runtime in our new approach, caused by the fact that the 
number of SAT instances to be solved grows linearly with the number of bits and quadrati- 
cally with the number of variables (the number of octagonal inequalities is quadratic in the 
number of variables). Computing the affine updates consumes only a fraction of the overall 
time. In terms of precision, the results coincide with those previously generated [13].

The block for swap is interesting since it consists of three consecutive exclusive-or 
instructions, for which there is no coupling between different bits of the same register. 
Table 3: Experimental results for synthesis of transfer functions

                                       runtime
block          #instr  #affine  #bits  guards / #SAT   affine / #SAT  overall  old
inc            1       2        8      0.2s / 32       0.1s / 5       0.3s     0.2s
                                32     0.5s / 128      0.2s / 5       1.0s     23.0s
inc+shift      2       3        8      0.3s / 48       0.1s / 8       0.4s     0.3s
                                32     0.8s / 192      0.2s / 8       1.0s     ∞
swap           3       1        8                      0.1s / 3       0.1s     0.1s
                                32                     0.1s / 3       0.1s     0.2s
inc+flip       4       2        8      0.2s / 32       0.2s / 5       0.4s     0.5s
                                32     0.9s / 128      0.3s / 5       1.2s     ∞
abs            5       3        8      2.5s / 216      0.3s / 8       2.8s     0.8s
                                32     6.5s / 792      0.3s / 8       6.8s     ∞
inc+abs        6       3        8      2.6s / 216      0.3s / 8       2.9s     1.4s
                                32     6.7s / 792      0.3s / 8       7.0s     ∞
sum+isign      7       9        8      5.9s / 648      0.2s / 24      4.3s     4.5s
                                32     19.7s / 2376    0.4s / 24      11.1s    ∞
exchange+abs   10      3        8      2.8s / 216      0.3s / 8       3.1s     9.5s
                                32     7.2s / 792      0.3s / 8       7.5s     ∞

The block is also unusual in that it is uni-modal with vacuous guards. These properties
make it ideal for resolution. Even in this situation, the new technique scales better. In
fact, the Boolean formulae that we present to the solver are almost trivial by modern
standards, the main overhead coming from repeated SAT solving rather than solving a
single large instance. Sat4J does reuse clauses learnt in earlier SAT instances, though
it does not permit clauses to be incrementally added and rescinded, which is useful when
solving maximisation problems [13]. Thus the timings given above are very conservative;
indeed Sat4J was chosen to maintain the portability of [mc]square rather than for raw
performance. Nevertheless, these timings compare very favourably with those required to
compute transfer functions for intervals using BDDs [65], where in excess of 24 hours is
required for single 8-bit instructions. Our experiences [11, 66] with native solvers such as
MiniSat, however, indicate that a tenfold speed-up can be achieved by replacing Sat4J.

7.2. Deriving octagonal transfer functions. The process of deriving octagonal transfer
functions by lifting (Sect. 4.3) requires an imperceptible overhead compared to computing
the affine relations themselves; indeed it is merely syntactic rewriting. The runtimes required
for inferring affine updates by alternating range refinement and affine join (Sect. 4.4), however,
are typically 3 or 4 times greater than those of computing the guards; the number of symbolic
constants on the output inequalities corresponds exactly to the number of input guards.
Since the octagon on input consists of 8 guards, and so does the octagon on output, the
worst case requires 16 + 1 iterations of affine abstraction and refinement; a single iteration
of refinement is no more expensive than in the cases given in Tab. 3 and the affine join has
imperceptible impact. We have observed that the full number of iterations is only needed for
programs for which there is no affine relation between octagons on input and output. We
refrain from giving exact times for the affine updates since they were computed with Z3
[32] rather than Sat4J and thus are not directly comparable.

7.3. Further optimisations. Since transfer functions are program dependent, one could
first use a simple form of range analysis [23, 66] to over-approximate the ranges a register
can assume. These ranges can be encoded in the formulae, thereby pruning out some mode
combinations. For example, it is rarely the case that the absolute value function is actually
applied to the smallest representable integer.

8. Related Work 

The problem of designing transfer functions for numeric domains is as old as the field of
abstract interpretation itself [26], and even the technique of using primed and unprimed
variables to capture and abstract the semantics of instructions and functions dates back
to the thesis work of Halbwachs [43]. However, even for a fixed abstract domain, there
are typically many ways of designing and implementing transfer functions. Cousot and
Halbwachs [29, Sect. 4.2.1], for example, discussed several ways to realise a transfer function
for assignments such as x = y × z in the polyhedral domain, while abstracting integer division
x = y/z is an interesting study within itself [77].

The problem of handcrafting best transformers is particularly challenging and Granger [40]
lamented the difficulty of devising precise transfer functions for linear congruences.
However, it took more than a decade after Granger's work before it was observed that best
transformers could automatically be constructed for domains of finite height [68].
Nevertheless, automatic abstraction (or the automatic synthesis of abstractions) has only recently
become a practical proposition, due to emergence of robust decision procedures [131 EH EZ] 
and efficient quantifier elimination techniques [51] [59] . 

8.1. Generation of symbolic best transformers. Transfer functions can always be
found for domains of finite height using the method of Reps et al. [68], provided one is
prepared to pay the cost of repeatedly calling a decision procedure or a theorem prover,
possibly many times on each application of a transformer. This motivates applying a decision
procedure in order to compute a best transformer offline, prior to the actual analysis [13, 50],
so as to both simplify and speed up their application.

Our previous work [13] shows how bit-blasting and quantifier elimination can be applied
to synthesise transformers for bit-vector programs. This work was inspired by that of
Monniaux [57, 59] on synthesising transfer functions for piecewise linear programs. Although
his approach extends beyond octagons [80], it is unclear how to express some instructions
(such as bit-wise exclusive-or) in terms of linear constraints. Universal quantification, as
used in both approaches, also appears in work on inferring linear template constraints [42].
There, Gulwani and his co-authors apply Farkas' lemma in order to transform universal
quantification into existential quantification, albeit at the cost of completeness since Farkas'
lemma prevents integral reasoning. However, crucially, neither Monniaux nor Gulwani
et al. provide a way to model integer overflow and underflow. Our work explains how to
systematically handle wrap-around arithmetic in the transfer function itself (without having
to revise the notion of abstraction [78]) whilst sidestepping quantifier elimination too.

Transfer functions for low-level code have been synthesised for intervals using BDDs [18]
by applying interval subdivision where the extrema representing the interval are themselves
represented as bit-vectors [65]. If $g : [0, 2^8 - 1] \to [0, 2^8 - 1]$ is a unary operation on an
unsigned byte, then its abstract transformer $f : D \to D$ on
$D = \{\emptyset\} \cup \{[\ell, u] \mid 0 \le \ell \le u < 2^8\}$ can be defined recursively. If $\ell = u$ then
$f([\ell, u]) = g(\ell)$ whereas if $\ell < u$ then $f([\ell, u]) = f([\ell, m - 1]) \sqcup f([m, u])$ where
$m = \lfloor u/2^n \rfloor 2^n$ and $n = \lfloor \log_2(u - \ell + 1) \rfloor$. Binary
operations can likewise be decomposed by repeatedly dividing squares into their quadrants.
The 8-bit inputs, $\ell$ and $u$, can be represented as 8-bit vectors, as can the 8-bit outputs, so
as to represent $f$ with a BDD. This permits caching to be applied when $f$ is computed,
which reduces the time needed to compute a best transformer to approximately 24 hours
for each 8-bit operation. It is difficult to see how this approach can be extended to blocks
that involve many variables without a step-change in BDD performance.

The question of how to construct a best abstract transformer has also been considered in 
the context of Markov decision processes (MDPs) for which the first abstract interpretation 
framework has recently been developed [85]. The framework affords the calculation of both 
lower and upper bounds on reachability probabilities, which is novel. The work focuses 
on predicate abstraction [39], which has had some success with large MDPs, and seeks to
answer the question of what, for a given set of predicates, is the most precise abstract program
that is still a correct abstraction. More generally, the work illustrates that the question of
how to compute the best abstract transformer is pertinent even in a probabilistic setting. 

8.2. Modular Arithmetic. The classical approach to handling overflows is to follow the
application of a transfer function with overflow and underflow checks; program variables
are considered to be unbounded for the purposes of applying the transfer function but then
their sizes are considered and range tests and, if necessary, range adjustments are applied to
model any wrapping. This approach has been implemented in the Astrée analyzer [12, 28].
However, for convex polyhedra, it is also possible to revise the concretisation map to reflect
truncation so as to remove the range tests from most abstract operations [19, 75]. Another
choice is to deploy congruence relations [101 E] where the modulus is a power of two so
as to reflect the wrapping in the abstract domain itself [61]. This approach can be applied
to find both relationships between different words [61] and the bits that constitute words
[15, 49, 50] (the relative precision of these two approaches has recently been compared [33]).
Bit-level models have been combined with range inference [11, 23], though neither of these
works addresses relational abstraction or transfer function synthesis.

Modular arithmetic can be modelled with case splitting by introducing a propositional
variable that acts as a witness to an overflow. To illustrate, consider the 8-bit comparison
x + 100 < 10 [5H Sect. 6.4]. To model overflow a witness p ⇔ (x + 100 ≤ 255) is defined,
which is used to control case selection. Case selection is realised through two constraints
defined by p ⇒ (x + 100) < 10 and (¬p) ⇒ ((x + 100) − 256) < 10. Case-based axiomatisations
can even be used to model underflows and rounding-to-zero in IEEE-754 floating-point
arithmetic as shown in [57, Sect. 4.5]. These ideas are similar in spirit to those given in this
paper for decomposing a block into its modes which are selected by guards.
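
The case split can be checked against the wrap-around semantics by enumerating all unsigned
8-bit values of x. The Python sketch below is only an illustration of the encoding quoted
above; the function names are hypothetical.

```python
def wraps_lt(x):
    """Concrete 8-bit semantics of x + 100 < 10, with wrap-around modulo 256."""
    return (x + 100) % 256 < 10


def case_split(x):
    """The two guarded constraints, with p witnessing the absence of overflow."""
    p = x + 100 <= 255
    no_overflow_case = (not p) or (x + 100 < 10)          # p => (x + 100) < 10
    overflow_case = p or ((x + 100) - 256 < 10)           # not p => (x + 100) - 256 < 10
    return no_overflow_case and overflow_case


agree = all(wraps_lt(x) == case_split(x) for x in range(256))
solutions = [x for x in range(256) if case_split(x)]
```

The two formulations agree on every input, and the comparison holds only for inputs whose
sum overflows, namely those between 156 and 165.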

8.3. Polynomial Relations. The last decade has seen increasing interest in the derivation
of polynomial invariants, with techniques broadly falling into two classes: methods that use
algebraic techniques to operate directly over polynomials and methods that model polynomial
invariants in a linear setting. The work of Colón [24] is representative of the latter, for
he shows how polynomial relations of bounded degree can be derived using program
transformation. To illustrate, suppose a variable a is updated using the assignment a = a + 1. A
variable s is introduced to represent the non-linear term $a^2$ and the program is extended by
replacing the assignment a = a + 1 with the parallel assignment (a, s) = (a + 1, s + 2a + 1)
so as to reflect the update on a to s. Linear invariants between a, s and the other variables
in the transformed program are then reinterpreted as polynomial invariants. The idea of
using non-linear terms as additional independent variables also arises in the work of Bagnara
et al. [3] who use convex polyhedra to represent polynomial cones of bounded degree and
thereby derive polynomial inequalities. They reduce the loss of precision induced through
linearisation by additional linear inequalities, which are included in the polyhedra to express
redundant non-linear constraints. The idea of extending a vector of variables with non-linear
terms also arises in the work of Müller-Olm and Seidl [60] who consider the complexity of
inferring polynomial equalities up to a fixed degree. They represent an affine relation with
a set of vectors that generate the space through linear combination. Extending this idea to
variables that represent non-linear terms naturally leads to the notion of polynomial hull
which is not dissimilar to the closure algorithm that is used in this paper for computing
non-linear update functions.
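
The program transformation described at the start of this subsection can be illustrated with
a few lines of Python. The sketch is a hypothetical example rather than code from the cited
work: it runs the parallel assignment for a number of steps and checks that the auxiliary
variable s continues to track the square of a.

```python
def run(a, steps):
    """Iterate (a, s) = (a + 1, s + 2a + 1), starting from s = a * a."""
    s = a * a
    for _ in range(steps):
        # Parallel assignment: both right-hand sides are evaluated with the old a.
        a, s = a + 1, s + 2 * a + 1
        assert s == a * a        # the linear update keeps s equal to the square of a
    return a, s


final_a, final_s = run(-3, 10)
```

Because s is updated by an expression that is linear in a and s, a linear invariant analysis
over (a, s) can be reinterpreted as a polynomial invariant over a alone.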

Quantifier elimination has been proposed as a technique for inferring polynomial
inequalities directly [U] in which the invariants are templates of polynomial inequalities with
undetermined coefficients. Deriving coefficients for the templates amounts to applying
quantifier elimination which can be computed using a parametric (or comprehensive) Gröbner
basis construction [88]. This approach resonates with the technique proposed by Monniaux
for inferring loop invariants [58]. Gröbner bases also arise in techniques for calculating
invariants that are based on fixed point calculation [69, 70], the main advantage of this
approach being that it does not assume any a priori bound on the degree of a polynomial as an
invariant. Polynomial analysis has also been applied in the field of SAT-based termination
analysis [35] using term rewriting [37, 83]. Their work provides techniques for encoding
polynomial equality and inequality constraints in propositional Boolean logic.

8.4. Procedure summaries. Abstracting the effect of a procedure in a summary is a key
problem in inter-procedural analysis [76] since it enables the effect of a call on abstract
state to be determined without repeatedly tracing the call. The challenge posed by
summaries is how they can be densely represented whilst supporting function composition
and function application. Gen/kill bit-vector problems [67] are amenable to efficient
representation, though for other problems, such as that of tracking two variable equalities [62],
it is better not to tabulate the effect of a call directly. This is because if a transformer is
distributive, then the lower adjoint of a transformer uniquely determines the transformer
and, perhaps surprisingly, the lower adjoint can sometimes be represented more succinctly
than the transformer itself.

Acceleration |38l l53l l5l"t 174] is attracting increasing interest as an alternative way of 
computing a summary of a procedure, or more exactly the loops that it contains. The idea 
is to track how program state changes on each loop iteration so as to compute the trajectory
of these changes (in a computation that is akin to transitive closure) and hence derive, in
a single step, a loop invariant that holds on all iterations of the loop.

Symbolic bounds, which are key to our transfer functions, also arise in a form of symbolic
bounds analysis [71] that aspires to infer ranges on pointer and array index variables in
terms of the parameters of a procedure. Lower and upper bounds on each program variable
at each program point are formulated as linear functions of the parameters of the function
where the coefficients are themselves parametric. The problem then amounts to inferring
values for these parametric coefficients. By assuming variables to be non-negative,
inequalities between the symbolic bounds can be reduced to inequalities between the parametric
coefficients, thereby reducing the problem to linear programming.

9. Concluding Discussion 

9.1. Synopsis. This article discusses the problem of automatically computing transfer
functions for programs whose semantics is defined over finite bit-vectors. The key aspect that
distinguishes our work from existing techniques [13, 57, 59] is that it does not depend
on quantifier elimination techniques at all. Although Boolean formulae presented in CNF
initially appear attractive for this task because of the simplicity of universal quantifier
elimination [13, Sect. 1.3], their real strength is the fact that they are discrete. This permits
linear equalities and inequalities to be inferred by repeated (incremental) satisfiability
testing, avoiding the need for quantifier elimination in the abstraction process entirely. Most
notably, this technique sidesteps the complexity of binary resolution. The force of this
observation is that it extends transfer function synthesis to architectures whose word size exceeds
8 bits, thereby strengthening the case for low-level code verification [6] 17] 18] |9"1 [T7 1 IM 1 I73" l [82] . 

9.2. Future work. The problem of synthesising transfer functions is not dissimilar to that
of inferring ranking functions for bit-vector relations [25]. Given a path $\pi$ with a transition
relation $r_\pi(x, x')$, proving the existence of a ranking function amounts to solving the formula
$\exists c : \forall x : \forall x' : r_\pi(x, x') \to (p(c, x) < p(c, x'))$ where $p(c, x)$ is a polynomial over the
bit-vector $x$ and $c$ is a bit-vector of coefficients [25, Thm. 2]. However, if intermediate
variables $y$ are needed to express $r_\pi(x, x')$, $p(c, x)$, $p(c, x')$ or $<$, then the formula actually
takes the form $\exists c : \forall x : \forall x' : \exists y : \nu$ where $\exists y : \nu$ is equisatisfiable to
$r_\pi(x, x') \to (p(c, x) < p(c, x'))$. This formula is structurally similar to those solved in [13] by quantifier
elimination, which raises the question of whether this problem, like that of transfer function
synthesis, can be recast to avoid elimination altogether. We will also investigate whether
transfer functions can be found, not only for sequences of instructions, but also for entire
loops [UJ [57]. Existing approaches for the specification of (least inductive) loop invariants
rely on existential quantification [57, Sect. 3.4], and the natural question is thus whether a
variation of the techniques proposed in this paper can annul this complexity.

An interesting open question is whether the techniques discussed in this paper can be
further generalised to linear template constraints with variable coefficients. As discussed
in Sect. 3.3, the dichotomic search can be applied to any template constraint of the form
$\sum_{i=1}^{n} c_i \cdot v_i \le d$, where $c_1, \ldots, c_n, d \in \mathbb{Z}$ are constants and $v_1, \ldots, v_n$ are variables. However,
some interesting abstract domains used in program analysis, such as two variables per
inequality [79, 80], do not fall into this class. It is still unclear if and how such relations can
be derived using binary search. It is also interesting to note that octagons derived using
our approach are tightly closed [56, Def. 3]. Intuitively, this means that all hyperplanes
defined through inequalities actually touch the enclosed volume. However, the octagons
may contain redundant inequalities, which may negatively affect performance [2, Sect. 3.2].
It will therefore be interesting to evaluate if simplification is worthwhile [2, Sect. 6.1] and,
if so, whether non-redundant octagons can be directly derived using SAT.